Abstract. We study depth properties of a general class of random recursive trees where
each node $i$ attaches to the random node $\lfloor iX_i \rfloor$ and $X_0, \dots, X_n$ is a sequence of i.i.d. random
variables taking values in $[0, 1)$. We call such trees scaled attachment random recursive trees
(SARRT). We prove that the typical depth $D_n$, the maximum depth (or height) $H_n$ and the
minimum depth $M_n$ of a SARRT are asymptotically given by $D_n \sim \mu^{-1} \log n$, $H_n \sim \alpha_{\max} \log n$
and $M_n \sim \alpha_{\min} \log n$, where $\mu$, $\alpha_{\max}$ and $\alpha_{\min}$ are constants depending only on the distribution
of $X_0$, whenever $X_0$ has a density. In particular, this gives a new elementary proof for the
height of uniform random recursive trees, $H_n \sim e \log n$, that does not use branching random
walks.



1. Introduction 

A uniform random recursive tree (URRT) $T_n$ of order $n$ is a tree with $n + 1$ nodes labeled
$\{0, 1, \dots, n\}$ constructed as follows. The root is labeled 0, and for $1 \le i \le n$, the node labeled
$i$ is inserted and chooses a vertex in $\{0, \dots, i - 1\}$ uniformly at random as its parent. The
asymptotic properties of $T_n$ (the depth of the last inserted node, the height of the tree, the
degree distribution, the number of leaves, the profile and so forth) have been extensively
studied starting from Gastwirth [18], Moon [23] and Na and Rapoport [23]. In particular,
Szymański [31] showed that the depth $D_n$ of node $n$ is $(1 + o(1)) \log n$ with probability going
to 1 and Pittel [26] proved that the height $H_n$ is $(e + o(1)) \log n$ with probability going to 1.
Distance measures in a URRT were also considered by Dobrow [13], Dobrow and Fill [14], Meir
and Moon [22], Neininger [25] and Su et al. [29]. For a survey, see Drmota [15] and Smythe
and Mahmoud [28].

A natural generalization of this model introduced by Devroye and Lu [11] is to let a vertex
choose $k > 1$ parents uniformly. This construction defines a random directed acyclic graph
($k$-DAG), which was used to model circuits (Arya et al. [2], Tsukiji and Xhafa [32]).

The uniformity condition was relaxed by Szymanski [30] by letting the probabilities of 
being chosen as a parent depend on the degree of the parent. When the probability of linking 
to a node is proportional to its degree, this gives a random plane-oriented recursive tree, the 
typical depth of which was studied by Mahmoud [20] and the height of which was studied by 
Pittel [26]. When $k > 1$ parents are chosen for each node, the popular preferential attachment
model of Barabási and Albert [3] is obtained.

Motivated by recent work on distances in random $k$-DAGs (Devroye and Janson [10]) and
on the power of choice in the construction of random trees (D'Souza et al. [16], Mahmoud
[21]), we introduce a generalization of uniform random recursive trees. In a scaled attachment
random recursive tree (SARRT), a node $i$ chooses its parent to be the node labeled $\lfloor iX_i \rfloor$ where
$X_0, X_1, \dots, X_n$ is a sequence of independent random variables distributed as $X$ over $[0, 1)$.
Note that the choice of the parent here only depends on the labels of previous nodes and
not on their properties relative to the tree (like the degree, for example). In particular, if
$X$ is uniform on $[0, 1)$ we get a URRT. The distribution $\mathcal{L}(X)$ of $X$ is called the attachment
distribution.

We study properties of the depth (path distance to the root of the tree) of nodes in a
SARRT with a general attachment distribution. We determine the first-order asymptotics
for the depth $D_n$ of the node labeled $n$, the height $H_n = \max_{1 \le i \le n} D_i$ of the tree and the
minimum depth $M_n = \min_{n/2 \le i \le n} D_i$. Our result gives a new way of computing the height of
a URRT that is not based on branching random walks, which were used in previous proofs by
Devroye [8] and Pittel [26].

Furthermore, setting $X = \max(U_1, \dots, U_k)$ where $U_1, \dots, U_k$ are independent random
variables with uniform distribution over $[0, 1)$, the depth $D_i$ of node $i$ in a SARRT with
attachment $X$ is the distance given by following the oldest parent from node $i$ to the root in
a random $k$-DAG [10, 21]. This problem can be seen as a "power of choice" question: how
much can one optimize properties of the tree when each node is given $k$ choices of parents? A
new node is given $k$ choices of parents, and it selects the best one according to some criterion.
In the setting of this paper, we study selection criteria that only depend on the labels or
arrival times of the potential parents. Our results describe the influence of a large class of
such selection criteria on the depth of the last inserted node, the height and the minimum
depth of the tree. This holds for a URRT and for almost any SARRT as well. Some examples
are given in Section 5.

Outline of the results. In Section 2, we prove a concentration result and a central limit
theorem for $D_n$ for a very general class of attachment distributions:

\[ \frac{D_n}{\log n} \xrightarrow{p} \frac{1}{\mu} \quad \text{and} \quad \frac{D_n - \mu^{-1} \log n}{\sigma \sqrt{\log n / \mu^3}} \xrightarrow{\mathcal{L}} \mathcal{N}(0, 1), \]

where $\mu$ and $\sigma^2$ are simply the expected value and the variance of $-\log X$, $\mathcal{N}(0, 1)$ denotes the
standard Gaussian distribution and the symbols $\xrightarrow{p}$ and $\xrightarrow{\mathcal{L}}$ refer to convergence in probability
and convergence in distribution. This generalizes a result of Mahmoud [21]. In Sections 3 and
4 we prove the main theorems (Theorems 2 and 6) of this paper: if $\mathcal{L}(X)$ has a density on
$[0, 1)$, then there exist constants $\alpha_{\max}$ and $\alpha_{\min}$ such that

\[ \lim_{n \to \infty} \frac{H_n}{\log n} = \alpha_{\max} \ \text{almost surely, and} \quad \frac{M_n}{\log n} \xrightarrow{p} \alpha_{\min}, \]

where $H_n$ and $M_n$ denote the height and minimum depth of the SARRT with attachment $X$.
These constants are defined as the solutions of equations involving a rate function associated
with $\log X$. The proof of these results uses a second moment method. The main difficulty in
the proof is in controlling the dependencies between the paths up to the root that originate
from different nodes. We also prove that $\lim_{n \to \infty} \mathbf{E}\{H_n\} / \log n = \alpha_{\max}$.

The different results are applied to study the properties of various path lengths in a
random $k$-DAG in Section 5. Lastly, we include an appendix proving some simple properties of
the large deviation rate functions used.

Notation. As introduced earlier, the symbols $\xrightarrow{p}$ and $\xrightarrow{\mathcal{L}}$ refer to convergence in probability
and convergence in distribution respectively. For random variables $X$ and $Y$, we write $\mathcal{L}(X)$
for the distribution of $X$ and $X \stackrel{\mathcal{L}}{=} Y$ when $X$ and $Y$ have the same distribution. For a general
random variable $X \in [0, 1)$, we define

\[ \mu = \mathbf{E}\{-\log X\} \ge 0 \quad \text{and} \quad \sigma^2 = \mathrm{Var}\{-\log X\}. \]

If $X$ has an atom at 0, then $\mu = \sigma = +\infty$. If $\mu = +\infty$, then we define $\sigma = +\infty$. A SARRT
with attachment distribution $\mathcal{L}(X)$ is described by a sequence $X_0, X_1, \dots, X_n$ of i.i.d. random
variables distributed as $X$. The parent of node $i$ is labeled $\lfloor iX_i \rfloor$. The root of the tree is
labeled 0 and $L(n, j)$ is the (random) label of the $j$-th grandparent of $n$ on its path to the
root. Note that $L(n, j + 1) = \lfloor L(n, j) X_{L(n, j)} \rfloor$ and that $L(n, 0) = n$. The depth $D_i$ of node $i$
is defined by $D_i = \min\{j \ge 0 : L(i, j) = 0\}$.
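The construction described in the Notation paragraph can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `build_sarrt` and `depths`, the seed and the choice $n = 20000$ are assumptions made for the example and do not come from the paper. With $X$ uniform on $[0, 1)$ the tree is a URRT.

```python
import math
import random

def build_sarrt(n, sample_x, rng):
    """Return parent[0..n]; node i >= 1 attaches to floor(i * X_i)."""
    parent = [0] * (n + 1)  # parent[0] is a placeholder: the root has no parent
    for i in range(1, n + 1):
        x = sample_x(rng)  # X_i, assumed to lie in [0, 1)
        # floor(i * X_i) < i in exact arithmetic; min() guards against float rounding
        parent[i] = min(int(i * x), i - 1)
    return parent

def depths(parent):
    """Depth of every node; parents have smaller labels, so one pass suffices."""
    d = [0] * len(parent)
    for i in range(1, len(parent)):
        d[i] = d[parent[i]] + 1
    return d

rng = random.Random(0)
n = 20000
parent = build_sarrt(n, lambda r: r.random(), rng)  # uniform attachment: a URRT
d = depths(parent)
# Ratios to compare (for finite n only roughly) with 1/mu = 1 and alpha_max = e
print(d[n] / math.log(n), max(d) / math.log(n))
```

Because a parent always has a smaller label than its child, the depths can be filled in increasing label order without any recursion.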



2. The depth of a typical node 



We look at the sequence of labels from node $n$ to the root as a renewal process. We have

\[ D_n = \min\{j \ge 0 : L(n, j) = 0\} = \min\{j \ge 0 : \lfloor \dots \lfloor \lfloor nX_n \rfloor X_{L(n,1)} \rfloor \dots X_{L(n,j-1)} \rfloor = 0\}. \]

Note that

\[ nX_n X_{L(n,1)} \cdots X_{L(n,j-1)} - j \le \lfloor \dots \lfloor \lfloor nX_n \rfloor X_{L(n,1)} \rfloor \dots X_{L(n,j-1)} \rfloor \le nX_n X_{L(n,1)} \cdots X_{L(n,j-1)}. \]

Remark. Since $X \in [0, 1)$, we have $\mu = \mathbf{E}\{-\log X\} > 0$. Thus, the following theorem covers
all the possible cases.

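Theorem 1.

(A) If $\mu = +\infty$, then $\frac{D_n}{\log n} \xrightarrow{p} 0$ and $\lim_{n \to \infty} \frac{\mathbf{E}\{D_n\}}{\log n} = 0$.

(B) If $\mu < +\infty$, then $\frac{D_n}{\log n} \xrightarrow{p} \frac{1}{\mu}$ and $\lim_{n \to \infty} \frac{\mathbf{E}\{D_n\}}{\log n} = \frac{1}{\mu}$.

(C) If $\mu < +\infty$ and $0 < \sigma < +\infty$, then $\frac{D_n - \mu^{-1} \log n}{\sigma \sqrt{\log n / \mu^3}} \xrightarrow{\mathcal{L}} \mathcal{N}(0, 1)$.

(D) If $\mu < +\infty$ and $\sigma^2 = 0$, then $D_n - \log n / \mu = o(\sqrt{\log n})$ almost surely.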



Remark. Mahmoud [21] proved a similar result using generating functions for the case
$X = \max(U_1, \dots, U_k)$ and $X = \min(U_1, \dots, U_k)$. Details are given in Section 5.
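As a quick illustration of how the constants in Theorem 1 can be evaluated in the two cases of this remark, here is a short computation. It is a sketch added for orientation, not a statement taken from the paper; $H_k = \sum_{j=1}^{k} 1/j$ denotes the $k$-th harmonic number, a symbol introduced only for this example.

\[
\begin{aligned}
X = \max(U_1, \dots, U_k): \quad & \mathbf{P}\{-\log X > s\} = \mathbf{P}\{X < e^{-s}\} = e^{-ks}, \ \text{so } -\log X \text{ is exponential with rate } k, \\
& \mu = \frac{1}{k}, \qquad \sigma^2 = \frac{1}{k^2}, \qquad \frac{D_n}{\log n} \xrightarrow{p} k. \\
X = \min(U_1, \dots, U_k): \quad & \mu = \int_0^1 (-\log x) \, k (1 - x)^{k-1} \, dx = H_k, \qquad \frac{D_n}{\log n} \xrightarrow{p} \frac{1}{H_k}.
\end{aligned}
\]

For $k = 1$ both cases reduce to the URRT value $\mu = 1$, and for $k = 2$ the second integral evaluates to $2(1 - 1/4) = 3/2 = H_2$.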

Proof. We consider an auxiliary renewal process $R_t = \sup\{j : \sum_{i=1}^{j} Z_i \le t\}$ with interarrival
times distributed as $Z_i \stackrel{\mathcal{L}}{=} -\log X$ for all $i$. When $\mu < +\infty$, the strong law of large numbers for
renewal processes gives that $R_t / t \to 1/\mu$ almost surely (see [27], Proposition 3.3.1). Moreover,
the elementary renewal theorem implies that $\mathbf{E}\{R_t\} / t \to 1/\mu$. The following claim handles
the case $\mu = +\infty$.

Claim. For $\mu = +\infty$, $\lim_{t \to \infty} R_t / t = 0$ with probability 1 and $\lim_{t \to \infty} \mathbf{E}\{R_t\} / t = 0$.

Proof. For fixed $b > 0$, let $\tilde{Z}_i = \min(Z_i, a)$ where $a$ is chosen so that $\mathbf{E}\{\tilde{Z}_i\} \ge b$. Consider
the renewal process $\tilde{R}_t$ with interarrival times $\tilde{Z}_i$. By the fact that $R_t \le \tilde{R}_t$ and the law of
large numbers for $\tilde{R}_t$ we have, for sufficiently large $t$, $R_t / t \le \tilde{R}_t / t \le 2/b$ almost surely. Since $b$
is arbitrary, we have $R_t / t \to 0$ with probability 1. The convergence of the expected value is
proved in a similar way. This concludes the proof of the claim. □
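The renewal law of large numbers used above can be checked numerically. The following sketch assumes the uniform attachment case, where $Z = -\log X$ is exponential with mean $\mu = 1$; the helper name `renewal_count`, the seed and the horizon $t = 20000$ are illustrative choices, not part of the paper.

```python
import math
import random

def renewal_count(t, sample_z, rng):
    """R_t = sup{j : Z_1 + ... + Z_j <= t} for i.i.d. interarrival times Z_i."""
    total, j = 0.0, 0
    while True:
        total += sample_z(rng)
        if total > t:
            return j  # the first j interarrival times fit within t, the next one does not
        j += 1

rng = random.Random(1)
t = 20000.0
# Z = -log X with X uniform on (0, 1]: Z is exponential with mean mu = 1
r_t = renewal_count(t, lambda r: -math.log(1.0 - r.random()), rng)
ratio = r_t / t  # expected to be close to 1/mu = 1
print(ratio)
```

In this special case $R_t$ is a Poisson variable with mean $t$, so the ratio deviates from 1 by a quantity of order $t^{-1/2}$.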

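We upper bound the depth of node $n$ by

\[ D_n \le \min\{j : nX_n X_{L(n,1)} \cdots X_{L(n,j-1)} < 1\} = \min\Big\{j : \sum_{i=0}^{j-1} -\log X_{L(n,i)} > \log n\Big\} =: \overline{D}_n. \]

For $n \ge 1$, $\overline{D}_n \stackrel{\mathcal{L}}{=} R_{\log n} + 1$. So, we have for any $\varepsilon > 0$ that

\[ (1) \quad \mathbf{P}\Big\{\frac{D_n}{\log n} > \frac{1}{\mu} + \varepsilon\Big\} \le \mathbf{P}\Big\{\frac{\overline{D}_n}{\log n} > \frac{1}{\mu} + \varepsilon\Big\} = \mathbf{P}\Big\{\frac{R_{\log n} + 1}{\log n} > \frac{1}{\mu} + \varepsilon\Big\} = o(1). \]

Since $D_n \ge 0$, equation (1) proves part (A) of the theorem (by writing $1/\mu = 0$ when $\mu = +\infty$).
Similarly, a lower bound is given by

\[ D_n \ge \min\{j : nX_n \cdots X_{L(n,j-1)} - j < 1\} \ge \min\Big\{j : \sum_{i=0}^{j-1} -\log X_{L(n,i)} > \log n - \log j\Big\}. \]

Let $j(n) = \lfloor \log^2 n \rfloor$ and define the event

\[ E_n = \Big[\sum_{i=0}^{j(n)-1} -\log X_{L(n,i)} > \log n\Big]. \]

Using the upper bound (1), we have that $\mathbf{P}\{E_n\} \to 1$. Also, for $j \le j(n)$ we have $\log j \le 2 \log \log n$, and if
we define $f(n) = \log n - 2 \log \log n$, then when $E_n$ holds

\[ D_n \ge \min\Big\{j : \sum_{i=0}^{j-1} -\log X_{L(n,i)} > f(n)\Big\} =: \underline{D}_n. \]

We have $\underline{D}_n \stackrel{\mathcal{L}}{=} R_{f(n)} + 1$ for $n \ge 2$, and thus,

\[ (2) \quad \mathbf{P}\Big\{\frac{\underline{D}_n}{\log n} < \frac{1}{\mu} - \varepsilon\Big\} = \mathbf{P}\Big\{\frac{R_{f(n)} + 1}{f(n)} \cdot \frac{f(n)}{\log n} < \frac{1}{\mu} - \varepsilon\Big\} = o(1), \]

by the law of large numbers for renewal processes and the fact that

\[ \lim_{n \to \infty} \frac{f(n)}{\log n} = 1. \]

Combining (1) and (2) with the fact that $\mathbf{P}\{D_n \ge \underline{D}_n\} \ge \mathbf{P}\{E_n\}$ we obtain convergence in
probability of part (B) of the theorem. As for the expected value, we have for any $\varepsilon > 0$,

\[ (1/\mu - \varepsilon) \log n \cdot \mathbf{P}\{D_n \ge (1/\mu - \varepsilon) \log n\} \le \mathbf{E}\{D_n\} \le \mathbf{E}\{\overline{D}_n\}, \]

which completes the proof of (B).

By similar arguments using the central limit theorem for renewal processes (see [27],
Theorem 3.3.5) we can prove part (C) for $D_n$, by showing that

\[ \lim_{n \to \infty} \mathbf{P}\Big\{\frac{\overline{D}_n - \log n / \mu}{\sigma \sqrt{\log n / \mu^3}} < c\Big\} = \Phi(c) \quad \text{and} \quad \lim_{n \to \infty} \mathbf{P}\Big\{\frac{\underline{D}_n - \log n / \mu}{\sigma \sqrt{\log n / \mu^3}} < c\Big\} = \Phi(c), \]

where $\Phi$ is the cumulative distribution function of a standard $\mathcal{N}(0, 1)$ variable. The result
follows from the fact that $\underline{D}_n \le D_n \le \overline{D}_n$ with probability going to 1 as $n \to \infty$. The first
limit is clear and to show the second limit, write

\[ \underline{D}_n - \log n / \mu = (\underline{D}_n - f(n) / \mu) + (f(n) / \mu - \log n / \mu), \]

where we have

\[ \lim_{n \to \infty} \frac{f(n) / \mu - \log n / \mu}{\sigma \sqrt{f(n) / \mu^3}} = \lim_{n \to \infty} \frac{-2 \log \log n / \mu}{\sigma \sqrt{(\log n - 2 \log \log n) / \mu^3}} = 0. \]

Also, the central limit theorem for renewal processes implies that

\[ \lim_{n \to \infty} \mathbf{P}\Big\{\frac{\underline{D}_n - f(n) / \mu}{\sigma \sqrt{f(n) / \mu^3}} < c\Big\} = \Phi(c). \]

When $\sigma^2 = 0$, $X = e^{-\mu} \in (0, 1)$ almost surely. Then the label of the parent of node $i$ is $\lfloor ie^{-\mu} \rfloor$ and
$L(n, j) = \lfloor \lfloor \lfloor ne^{-\mu} \rfloor e^{-\mu} \rfloor \dots e^{-\mu} \rfloor$ ($j$ times) almost surely. We have $ne^{-\mu j} - j \le L(n, j) \le ne^{-\mu j}$,
and for $n \ge n_0(\mu)$ we have $ne^{-\mu j} < 1$ when $j > \log n / \mu$ and $ne^{-\mu j} - j \ge 1$ when $j < \log n / \mu$.
Then, we have that $|D_n - \log n / \mu| < 1$ for $n \ge n_0$. Therefore we get part (D) of the theorem. □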

3. The height of the tree 

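We turn our attention to the height $H_n = \max_{1 \le i \le n} D_i$ of a SARRT. For a random variable
$Y$, we define its cumulant generating function $\Lambda_Y$ and its convex (Fenchel-Legendre) dual $\Lambda_Y^*$
as follows:

\[ (3) \quad \Lambda_Y(\lambda) = \log \mathbf{E}\{e^{\lambda Y}\} \quad \text{and} \quad \Lambda_Y^*(z) = \sup_{\lambda \in \mathbb{R}} \{\lambda z - \Lambda_Y(\lambda)\}. \]

Since we mostly use these functions for $Y = \log X$, we omit the subscript in this case. We
write

\[ (4) \quad \Lambda(\lambda) = \log \mathbf{E}\{e^{\lambda \log X}\} = \log \mathbf{E}\{X^\lambda\} \quad \text{and} \quad \Lambda^*(z) = \sup_{\lambda \in \mathbb{R}} \{\lambda z - \Lambda(\lambda)\} \]

for the cumulant generating function of $\log X$ and its dual. It is well known that $\Lambda^*(z) =
\sup_{\lambda \ge 0} \{\lambda z - \Lambda(\lambda)\}$ for $z \ge \mathbf{E}\{\log X\}$ and $\Lambda^*(z) = \sup_{\lambda \le 0} \{\lambda z - \Lambda(\lambda)\}$ for $z \le \mathbf{E}\{\log X\}$.
This is proved along with many properties of $\Lambda^*$ used in the paper in Appendix B. We also
define

\[ (5) \quad \Psi(c) = c \Lambda^*(-1/c) \]

and

\[ (6) \quad \alpha_{\max} = \inf\Big\{c : c > \frac{1}{\mu} \text{ and } \Psi(c) > 1\Big\}, \]

where we define $1/\mu = 0$ when $\mu = +\infty$. Proposition 5 in the appendix shows that the set
$\{c : c > 1/\mu \text{ and } \Psi(c) > 1\}$ is non-empty, $\alpha_{\max} < +\infty$ and if $X$ is not a constant, $\alpha_{\max} > 1/\mu$.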
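To make definitions (4) to (6) concrete, the sketch below evaluates $\alpha_{\max}$ numerically in the uniform case, where $\Lambda(\lambda) = \log \mathbf{E}\{X^\lambda\} = -\log(1 + \lambda)$ and a direct computation gives $\Psi(c) = 1 - c + c \log c$ for $c > 1$, hence $\alpha_{\max} = e$. The function names, the search bounds `lam_max` and `c_hi`, and the assumption that $\Psi$ is increasing on the search interval are choices made for this example only, not part of the paper.

```python
import math

def cgf_uniform(lam):
    # Lambda(lam) = log E{X^lam} = -log(1 + lam) for X uniform on [0, 1), lam >= 0
    return -math.log1p(lam)

def rate(z, cgf, lam_max=1e3, iters=200):
    # Lambda*(z) = sup over lam in [0, lam_max] of lam*z - cgf(lam);
    # the objective is concave in lam, so a ternary search locates the supremum
    lo, hi = 0.0, lam_max
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if m1 * z - cgf(m1) < m2 * z - cgf(m2):
            lo = m1
        else:
            hi = m2
    lam = (lo + hi) / 2
    return lam * z - cgf(lam)

def psi(c, cgf):
    # Psi(c) = c * Lambda*(-1/c), as in definition (5)
    return c * rate(-1.0 / c, cgf)

def alpha_max(cgf, inv_mu, c_hi=50.0, iters=100):
    # Bisection for inf{c > 1/mu : Psi(c) > 1}, assuming Psi is increasing on (1/mu, c_hi)
    lo, hi = inv_mu, c_hi
    for _ in range(iters):
        mid = (lo + hi) / 2
        if psi(mid, cgf) > 1.0:
            hi = mid
        else:
            lo = mid
    return hi

alpha = alpha_max(cgf_uniform, inv_mu=1.0)
print(alpha, math.e)
```

Other attachment distributions can be tried by swapping in a different cumulant generating function, provided the supremum is attained below `lam_max`.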
The following theorem sums up the results we prove in this section.

Theorem 2. The height $H_n$ of a SARRT with attachment $X$ having a density satisfies

\[ \lim_{n \to \infty} \frac{H_n}{\log n} = \alpha_{\max} \ \text{with probability 1, and} \quad \lim_{n \to \infty} \frac{\mathbf{E}\{H_n\}}{\log n} = \alpha_{\max}, \]

where $\alpha_{\max}$ is defined in equation (6).

Remark. It is worth observing that if $X$ is not constant and $\mu = +\infty$, then $D_n = o(\log n)$ in
probability as shown in Theorem 1 whereas $H_n = \Theta(\log n)$ in probability. If $X = a \in (0, 1)$
with probability 1, then $\alpha_{\max} = 1/\mu = -1/\log a$ and it is easy to see that the results of the
theorem also hold in this case.



We start by proving convergence in probability of $H_n / \log n$ in Sections 3.1 and 3.2 in the case
of a bounded density. Section 3.1 gives an upper bound for $H_n$ with no condition on $X$.
The lower bound we present in Section 3.2 is more involved and uses an upper bound on the
density in order to bound the dependence between different paths. In Section 3.3, we show
that the lower bound still holds if $X$ has an unbounded density. Finally, Section 3.4 is devoted
to proving almost sure convergence and convergence in mean as stated in the above theorem.



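3.1. The height of the tree: upper bound. Based on the bounding techniques of Chernoff
[4] and Hoeffding [19] we can prove the following result.

Lemma 1. For any $c > \alpha_{\max}$, we have $\mathbf{P}\{H_n > c \log n\} \to 0$.

Proof. To simplify the notation, we prove $\mathbf{P}\{H_n > c \log n + 2\} \to 0$ for all $c > \alpha_{\max}$, which is
an equivalent statement. For $t \ge 1$, applying Markov's inequality, we get

\[ \mathbf{P}\{D_n > t\} \le \mathbf{P}\{nX_n \cdots X_{L(n,t-1)} \ge 1\} \le \inf_{\lambda \ge 0} n^\lambda \mathbf{E}\{X^\lambda\}^t = \inf_{\lambda \ge 0} \exp(\lambda \log n + \Lambda(\lambda) t). \]

Setting $t = \lceil c \log n \rceil$, we obtain

\[
\begin{aligned}
\mathbf{P}\{D_n > c \log n + 2\} &\le \inf_{\lambda \ge 0} \exp(\lambda \log n + \Lambda(\lambda) c \log n) \quad (\text{as } \Lambda(\lambda) \le 0) \\
&\le \exp\Big(-\sup_{\lambda \ge 0} \Big\{-\frac{\lambda}{c} - \Lambda(\lambda)\Big\} c \log n\Big) \\
&= \exp(-c \Lambda^*(-1/c) \log n) \\
(7) \qquad &= n^{-\Psi(c)}.
\end{aligned}
\]

When $\Psi(c) > 0$, the bound in (7) goes to 0. Recalling that $c > \alpha_{\max}$ and the definition of
$\alpha_{\max}$ (equation (6)), we obtain $\Psi(c) > 1$. Applying a union bound, we get

\[ (8) \quad \mathbf{P}\{H_n > t\} = \mathbf{P}\Big\{\max_{1 \le i \le n} D_i > t\Big\} \le \sum_{i=1}^{n} \mathbf{P}\{D_i > t\} \le n \mathbf{P}\{D_n > t\} \le n^{1 - \Psi(c)} \to 0 \]

as $n \to \infty$. Note that the second inequality holds because $\lfloor \dots \lfloor \lfloor iX_i \rfloor X_{L(i,1)} \rfloor \dots X_{L(i,t-1)} \rfloor$ is
stochastically smaller than $\lfloor \dots \lfloor \lfloor nX_n \rfloor X_{L(n,1)} \rfloor \dots X_{L(n,t-1)} \rfloor$ for $i \le n$ as the sequence $(X_i)$
is i.i.d. □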

In the next section we prove a lower bound on the height of the tree. We show that for
any $c < \alpha_{\max}$, there exists a node of depth larger than $c \log n$.

3.2. The height of the tree: lower bound.

Overview of the proof. It is worth observing first that the upper bound (Lemma 1)
does not take into account the structure of the tree in any way. Introduce the events
$A_x = [D_x \ge (\alpha_{\max} - \varepsilon) \log n]$ where $\varepsilon \in (0, \alpha_{\max})$. We omit the dependence in $\varepsilon$ in this
overview. Applying a second moment inequality sometimes called the Chung-Erdős inequality
[5], we get

\[ (9) \quad \mathbf{P}\Big\{\bigcup_{x=1}^{n} A_x\Big\} \ge \frac{\big(\sum_{x=1}^{n} \mathbf{P}\{A_x\}\big)^2}{\sum_{x=1}^{n} \mathbf{P}\{A_x\} + \sum_{x \ne y} \mathbf{P}\{A_x \cap A_y\}}. \]

It is not hard to show that $\sum_{x=1}^{n} \mathbf{P}\{A_x\} \to +\infty$ as $n \to \infty$. Hence, showing that

\[ \sum_{x \ne y} \mathbf{P}\{A_x \cap A_y\} \le (1 + o(1)) \Big(\sum_{x=1}^{n} \mathbf{P}\{A_x\}\Big)^2 \]

would imply that the right hand side of (9) goes to 1. This would prove the lower bound
on the height that we seek. Therefore, our objective is to prove that the collisions between
branches of the tree, which are responsible for the dependence between $A_x$ and $A_y$, do
not influence the joint probabilities $\mathbf{P}\{A_x \cap A_y\}$ by much. In order to be able to control the
collision probabilities, we add some restrictions to the event $A_x$. Instead of only looking for
long paths in the tree, we look for paths that maintain large enough labels at each step. See
equation (13) for a definition. The probability of such an event can be bounded (Lemma 2)
using a rotation argument introduced by Andersen [1] and Dwass [17] and used in the context
of random trees by Devroye and Reed [12].

To simplify the presentation, the proof is carried out first for the case where $X$ has a
bounded density and possibly a mass at 0, i.e.,

\[ (10) \quad X = \begin{cases} \tilde{X} & \text{with probability } 1 - p \\ 0 & \text{with probability } p, \end{cases} \]

where $\mathcal{L}(\tilde{X})$ has a bounded density on $(0, 1)$ and $p \in [0, 1]$. The reason we allow $X$ to have an
atom at 0 is to later handle attachment distributions having unbounded densities (Theorem 4).

Preliminary lemmas. We begin by stating precise bounds on the probabilities of events of
the form $[X_1 \cdots X_t \ge b]$.

Proposition 1 (Cramér [6], see also Dembo and Zeitouni [7], chapter 2, page 27). Let
$Y_1, \dots, Y_t$ be a sequence of independent real random variables distributed as $Y$ and having a
well-defined expected value $\mathbf{E}\{Y\} \in \mathbb{R} \cup \{\pm\infty\}$. For any constant $a \in \mathbb{R}$, we have

\[ \mathbf{P}\{Y_1 + \dots + Y_t \ge ta\} = \exp(-t \Lambda_Y^*(a) + o(t)) \quad \text{if } a > \mathbf{E}\{Y\} \text{ and } \mathbf{E}\{Y\} \ne +\infty, \]

\[ \mathbf{P}\{Y_1 + \dots + Y_t \le ta\} = \exp(-t \Lambda_Y^*(a) + o(t)) \quad \text{if } a < \mathbf{E}\{Y\} \text{ and } \mathbf{E}\{Y\} \ne -\infty, \]

where $\Lambda_Y^*$ is as defined in equation (3).

Before stating the corollary that we need, we define the rate function $\Lambda^*$ for a random
variable $\log X$ that has an atom at $-\infty$. The function $\varphi : \lambda \mapsto \lambda z - \log \mathbf{E}\{e^{\lambda \log X}\}$ is well
defined for $\lambda > 0$. We extend it for $\lambda = 0$ by $\varphi(0) = -\log(1 - \mathbf{P}\{\log X = -\infty\})$. Then, $\Lambda^*$ is
defined by

\[ (11) \quad \Lambda^*(z) = \sup_{\lambda \ge 0} \{\varphi(\lambda)\} \]

for all real $z \ge \mathbf{E}\{\log X\}$. Note that this definition coincides with the definition given in (4) if
$\mathbf{P}\{X = 0\} = 0$.

Corollary 1. Let $X$ have an atom at 0 with mass $p$ and any distribution on $(0, 1)$ with total
mass $1 - p$. Let $X_1, \dots, X_t$ be i.i.d. random variables distributed as $X$. Then,

\[ \mathbf{P}\{X_1 \cdots X_t \ge e^{ta}\} = \exp(-t \Lambda^*(a) + o(t)) \quad \begin{cases} \text{for } a > \mathbf{E}\{\log X\} & \text{if } \mathbf{E}\{\log X\} > -\infty \\ \text{for } a \in \mathbb{R} & \text{if } \mathbf{E}\{\log X\} = -\infty. \end{cases} \]
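Proof. First if $p = 0$, we can apply Cramér's theorem to $\log X$ and get the desired result.
In what follows, assume $p > 0$ so that $\log X = -\infty$ with positive probability. Let $t > 0$ be an
integer, and let $\tilde{X}_1, \dots, \tilde{X}_t$ be $t$ independent random variables having the distribution of $X$
conditioned on $X > 0$. If any $X_i = 0$, $1 \le i \le t$, then the product $X_1 \cdots X_t = 0$ and thus

\[ \mathbf{P}\{X_1 \cdots X_t \ge e^{ta}\} = (1 - p)^t \, \mathbf{P}\{\tilde{X}_1 \cdots \tilde{X}_t \ge e^{ta}\} = (1 - p)^t \, \mathbf{P}\{\log \tilde{X}_1 + \dots + \log \tilde{X}_t \ge ta\}. \]

For $a > \mathbf{E}\{\log \tilde{X}\}$, we get

\[ \mathbf{P}\{X_1 \cdots X_t \ge e^{ta}\} = (1 - p)^t \exp(-t \Lambda_{\log \tilde{X}}^*(a) + o(t)) = \exp(-t (\Lambda_{\log \tilde{X}}^*(a) - \log(1 - p)) + o(t)). \]

Then, assume $\mathbf{E}\{\log \tilde{X}\} > -\infty$ and $a < \mathbf{E}\{\log \tilde{X}\}$. Using the law of large numbers for
$\log \tilde{X}$, we get

\[ \lim_{t \to \infty} \mathbf{P}\{\tilde{X}_1 \cdots \tilde{X}_t \ge e^{ta}\} = 1. \]

Thus,

\[ (1 - p)^t (1 - o(1)) \le \mathbf{P}\{X_1 \cdots X_t \ge e^{ta}\} \le (1 - p)^t, \]

which implies

\[ \mathbf{P}\{X_1 \cdots X_t \ge e^{ta}\} = \exp(t \log(1 - p) + o(t)). \]

It only remains to show that

\[ (12) \quad \Lambda^*(z) = \begin{cases} \Lambda_{\log \tilde{X}}^*(z) - \log(1 - p) & \text{for } z > \mathbf{E}\{\log \tilde{X}\} \\ -\log(1 - p) & \text{for } z \le \mathbf{E}\{\log \tilde{X}\}. \end{cases} \]

Let $U$ be a random variable uniformly distributed on $(0, 1)$ and independent from $X$ and $\tilde{X}$.
Consider the event $A = [U \le p]$. Then $X \stackrel{\mathcal{L}}{=} 0 \cdot 1_A + \tilde{X} 1_{A^c}$. Thus, for $z \ge \mathbf{E}\{\log X\}$, we have

\[
\begin{aligned}
\sup_{\lambda > 0} \{\lambda z - \log \mathbf{E}\{X^\lambda\}\} &= \sup_{\lambda > 0} \{\lambda z - \log \mathbf{E}\{(0 \cdot 1_A + \tilde{X} 1_{A^c})^\lambda\}\} \\
&= \sup_{\lambda > 0} \{\lambda z - \log \mathbf{E}\{\tilde{X}^\lambda 1_{A^c}\}\} \\
&= \sup_{\lambda > 0} \{\lambda z - \log(\mathbf{E}\{\tilde{X}^\lambda\} \mathbf{E}\{1_{A^c}\})\} \\
&= \sup_{\lambda > 0} \{\lambda z - \log(\mathbf{E}\{\tilde{X}^\lambda\} (1 - p))\} \\
&= \sup_{\lambda > 0} \{\lambda z - \log \mathbf{E}\{\tilde{X}^\lambda\}\} - \log(1 - p).
\end{aligned}
\]

As a result, using the definition (11), we obtain

\[
\begin{aligned}
\Lambda^*(z) &= \max\Big\{\sup_{\lambda > 0} \{\lambda z - \log \mathbf{E}\{\tilde{X}^\lambda\}\} - \log(1 - p), \ -\log(1 - p)\Big\} \\
&= \sup_{\lambda \ge 0} \{\lambda z - \log \mathbf{E}\{\tilde{X}^\lambda\}\} - \log(1 - p),
\end{aligned}
\]

which matches the expression (12) using the properties of $\Lambda^*$ proved in Appendix B. □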



The next lemma is based on a rotation argument introduced by Andersen [1] and Dwass
[17].

Lemma 2. Let $t$ be a positive integer, let $\beta > 0$, and let $X_1, \dots, X_t$ be a sequence of non-negative
independent and identically distributed random variables. Then

\[ \mathbf{P}\{X_1 \ge \beta, X_1 X_2 \ge \beta^2, \dots, X_1 \cdots X_t \ge \beta^t\} \ge \frac{1}{t} \mathbf{P}\{X_1 \cdots X_t \ge \beta^t\}. \]

Proof. As $X_1, \dots, X_t$ are i.i.d., we can circularly continue the indices: $Y_a = Y_{a+t} = X_a / \beta$ for all
$a \in \{1, \dots, t\}$. Then,

\[ \mathbf{P}\{X_1 \ge \beta, \dots, X_1 \cdots X_t \ge \beta^t\} = \mathbf{P}\{Y_1 \ge 1, \dots, Y_1 \cdots Y_t \ge 1\} = \mathbf{P}\{Y_{a+1} \ge 1, \dots, Y_{a+1} \cdots Y_{a+t} \ge 1\} \]

for all $a \in \{1, \dots, t\}$ since the variables are i.i.d.

Define $a \in \{1, \dots, t\}$ as the first minimum of $Y_1 \cdots Y_a$. Then $Y_1 \cdots Y_t \ge 1$ implies that for
all $b \in \{1, \dots, t\}$,

\[ Y_{a+1} \cdots Y_{a+b} = \frac{Y_1 \cdots Y_{a+b}}{Y_1 \cdots Y_a} \ge 1. \]

If $a + b \le t$, the inequality holds by our choice of $a$. For $a + b > t$, it can be seen by writing
$Y_1 \cdots Y_{a+b} = Y_1 \cdots Y_t \, Y_1 \cdots Y_{a+b-t}$ and using that $Y_1 \cdots Y_t \ge 1$. Thus,

\[ [Y_1 \cdots Y_t \ge 1] \subseteq \bigcup_{a=1}^{t} [Y_{a+1} \ge 1, \dots, Y_{a+1} \cdots Y_{a+t} \ge 1]. \]

So we have

\[ \mathbf{P}\{Y_1 \cdots Y_t \ge 1\} \le t \cdot \mathbf{P}\{Y_1 \ge 1, \dots, Y_1 \cdots Y_t \ge 1\}. \quad □ \]
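The combinatorial core of the rotation argument can be checked by brute force. The sketch below works with logarithms, so that products become sums and the condition "all partial products are at least 1" becomes "all partial sums are at least 0"; it uses small integer values to avoid floating-point issues. The helper names and the test range $\{-2, \dots, 2\}^5$ are assumptions of this example, not part of the paper.

```python
from itertools import product

def first_min_rotation(logs):
    """Index a in {1, ..., t} of the first minimum of the prefix sums s_1, ..., s_t."""
    prefix, s = [], 0
    for v in logs:
        s += v
        prefix.append(s)
    return prefix.index(min(prefix)) + 1

def rotated_prefix_sums(logs, a):
    """Prefix sums of the cyclic shift log Y_{a+1}, ..., log Y_{a+t}."""
    t = len(logs)
    out, s = [], 0
    for b in range(t):
        s += logs[(a + b) % t]  # logs[0] holds log Y_1, so logs[a % t] is log Y_{a+1}
        out.append(s)
    return out

checked = 0
for logs in product(range(-2, 3), repeat=5):
    if sum(logs) >= 0:  # corresponds to Y_1 ... Y_t >= 1
        a = first_min_rotation(logs)
        # after rotating at the first minimum, every partial sum is non-negative
        assert min(rotated_prefix_sums(logs, a)) >= 0
        checked += 1
print(checked, "sequences verified")
```

Since at least one of the $t$ rotations works whenever the full product is at least 1, and all rotations have the same distribution, the factor $1/t$ in Lemma 2 follows from a union bound.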

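Proof of the lower bound. For convenience of notation, the nodes of the tree are labeled
from 0 to $3n$, and we shall study the height $H_{3n}$. For a node $x \in \{2n + 1, \dots, 3n\}$, $t \in \mathbb{N}$ and
$0 < \beta < 1$, define the event

\[ (13) \quad A_{x,t}(\beta) = [L(x, 1) \ge n\beta, \ L(x, 2) \ge n\beta^2, \ \dots, \ L(x, t) \ge n\beta^t]. \]

We set $A_{x,0}(\beta) = [L(x, 0) \ge n\beta^0] = [x \ge n]$ so that $\mathbf{P}\{A_{x,0}(\beta)\} = 1$. Note that when $\beta$ is
clear from the context, we just write $A_{x,t}$ for $A_{x,t}(\beta)$.

Lemma 3. Assume $\mathcal{L}(X)$ is not a single mass. Let $c \in (1/\mu, \alpha_{\max})$, $\beta = e^{-1/c}$ and $\delta > 0$
such that $\Psi(c) + \delta < 1$ and $\Psi(c) - \delta > 0$. Then there exists $t_0 = t_0(c, \delta, \mathcal{L}(X))$ such that for
all integers $t \ge t_0$, $n \ge t\beta^{-t}$ and $2n + 1 \le x \le 3n$,

\[ \frac{\beta^t}{t} \le \frac{\beta^{(\Psi(c) + \delta) t}}{t} \le \mathbf{P}\{A_{x,t}(\beta)\} \le \beta^{(\Psi(c) - \delta) t}. \]

Proof. First, using Proposition 5 in Appendix B, we know that $0 < \Psi(c) < 1$ for $c \in (1/\mu, \alpha_{\max})$.
So we can choose $\delta > 0$ with $\Psi(c) + \delta < 1$ and $\Psi(c) - \delta > 0$.

We start with the upper bound. Using the same computation as in the previous section,

\[
\begin{aligned}
\mathbf{P}\{L(x, t) \ge n\beta^t\} &\le \mathbf{P}\{3n X_{L(x,0)} \cdots X_{L(x,t-1)} \ge n\beta^t\} \\
&= \mathbf{P}\{3\beta^{-t} X_{L(x,0)} \cdots X_{L(x,t-1)} \ge 1\} \\
&\le \inf_{\lambda \ge 0} \exp(\lambda(-t \log \beta + \log 3) + \Lambda(\lambda) t) \\
&= \exp\Big(-t \Lambda^*\Big(-\frac{1}{c} - \frac{\log 3}{t}\Big)\Big).
\end{aligned}
\]

By definition of $\Psi$, we have $\Lambda^*(-1/c) = \Psi(c)/c$. Thus for $t$ large enough, by continuity of $\Lambda^*$,
$\Lambda^*(-1/c - (\log 3)/t) \ge (\Psi(c) - \delta)/c$. Thus,

\[ \mathbf{P}\{L(x, t) \ge n\beta^t\} \le \exp(-t(\Psi(c) - \delta)/c) = \beta^{(\Psi(c) - \delta) t}. \]

To prove a lower bound on the probability of $A_{x,t}$, we use that for all $s \in \{1, \dots, t\}$

\[ [L(x, s) \ge n\beta^s] \supseteq [2n X_{L(x,0)} \cdots X_{L(x,s-1)} - s \ge n\beta^s] \supseteq [X_{L(x,0)} \cdots X_{L(x,s-1)} \ge \beta^s]. \]

The last inclusion holds because we assumed $n \ge t\beta^{-t} \ge s\beta^{-s}$ for all $s \le t$. Thus, we write

\[
\begin{aligned}
\mathbf{P}\{A_{x,t}\} &= \mathbf{P}\{L(x, 1) \ge n\beta, \ L(x, 2) \ge n\beta^2, \ \dots, \ L(x, t) \ge n\beta^t\} \\
&\ge \mathbf{P}\{X_{L(x,0)} \ge \beta, \ X_{L(x,0)} X_{L(x,1)} \ge \beta^2, \ \dots, \ X_{L(x,0)} \cdots X_{L(x,t-1)} \ge \beta^t\}.
\end{aligned}
\]

We now use Lemma 2 to get

\[ \mathbf{P}\{A_{x,t}\} \ge \frac{1}{t} \mathbf{P}\{X_{L(x,0)} \cdots X_{L(x,t-1)} \ge \beta^t\}. \]

Using Corollary 1 of Cramér's theorem,

\[ \mathbf{P}\{X_{L(x,0)} \cdots X_{L(x,t-1)} \ge \beta^t\} = \mathbf{P}\{X_{L(x,0)} \cdots X_{L(x,t-1)} \ge e^{-t/c}\} = \exp(-t \Lambda^*(-1/c) + o(t)). \]

But $\Lambda^*(-1/c) = \Psi(c)/c < (\Psi(c) + \delta)/c$. So for $t$ large enough,

\[ \mathbf{P}\{X_{L(x,0)} \cdots X_{L(x,t-1)} \ge \beta^t\} \ge \exp(-(\Psi(c) + \delta) t / c) = \beta^{(\Psi(c) + \delta) t}. \]

As a result

\[ \mathbf{P}\{A_{x,t}\} \ge \frac{\beta^{(\Psi(c) + \delta) t}}{t}. \quad □ \]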

Theorem 3 is proven using the second moment method on the number of nodes that have
a large depth.

Lemma 4. Let $X$ have an atom of weight $p$ at 0 for some $p \in [0, 1)$, and a density bounded by
$\kappa$, of total mass $1 - p$, on $(0, 1)$. Let $x \ne y$ be elements of $\{2n + 1, \dots, 3n\}$, let $t$ be a positive
integer and let $\beta \in (0, 1)$. Then

\[ \mathbf{P}\{A_{x,t} \cap A_{y,t}\} \le \sum_{s=0}^{t-1} \mathbf{P}\{A_{x,t}\} \mathbf{P}\{A_{y,s}\} \frac{\kappa (t + 1)}{n\beta^s} + \mathbf{P}\{A_{x,t}\} \mathbf{P}\{A_{y,t}\}. \]

Proof. If $v$ is a node of a SARRT, let $P_t(v) = \{L(v, 0), L(v, 1), \dots, L(v, t)\}$ be the first $t + 1$
elements of the (random) path connecting $v$ to the root of the tree. Given $x$ and $y$, define
$T = +\infty$ if $P_t(x) \cap P_t(y) = \emptyset$, otherwise set $T$ to be the minimum non-negative $s$ such that
$L(y, s + 1) \in P_t(x)$. Then

\[ \mathbf{P}\{A_{x,t} \cap A_{y,t}\} = \sum_{s=0}^{t-1} \mathbf{P}\{T = s, A_{x,t} \cap A_{y,t}\} + \mathbf{P}\{T = +\infty, A_{x,t} \cap A_{y,t}\}. \]

In order to evaluate this expression, we fix the path $P_t(x)$ from $x$ to its $t$-th ancestor.
Let $\Gamma = \{Q \subseteq \{0, \dots, 3n\} : x = \max Q, \ |Q| \le t + 1\}$ be the set of possible paths. For all
$s \in \{0, \dots, t - 1\}$

\[ \mathbf{P}\{T = s, A_{x,t} \cap A_{y,t}\} = \sum_{Q \in \Gamma} 1_{A_{x,t}}(Q) \, \mathbf{P}\{T = s, A_{y,t}, P_t(x) = Q\}, \]

where $1_{A_{x,t}}(Q)$ is the indicator of the event $A_{x,t}$ when $P_t(x) = Q$. As the event $A_{x,t}$ is
completely determined by the path $P_t(x)$, $1_{A_{x,t}}(Q)$ is deterministic. Hence

\[
\begin{aligned}
\mathbf{P}\{T = s, A_{x,t} \cap A_{y,t}\} &\le \sum_{Q \in \Gamma} 1_{A_{x,t}}(Q) \, \mathbf{P}\{P_s(y) \cap Q = \emptyset, \ L(y, s + 1) \in Q, \ A_{y,s}, \ P_t(x) = Q\} \\
&= \sum_{Q \in \Gamma} 1_{A_{x,t}}(Q) \sum_{u \notin Q} \mathbf{P}\{P_s(y) \cap Q = \emptyset, \ L(y, s) = u, \ \lfloor uX_u \rfloor \in Q, \ A_{y,s}, \ P_t(x) = Q\}.
\end{aligned}
\]

In order to simplify this expression, we use the independence claim below.

Claim. For any $Q \subseteq \{0, \dots, 3n\}$ and $u \notin Q$, the events $[P_s(y) \cap Q = \emptyset, L(y, s) = u, A_{y,s}]$,
$[\lfloor uX_u \rfloor \in Q]$ and $[P_t(x) = Q]$ are mutually independent.

Proof. We show that the three events live in independent sigma-algebras. Recall that an event
$E$ is said to be in the sigma-algebra generated by a random variable $Y$ when knowing the
value of $Y$ determines whether $E$ holds or not.

(i) $[P_s(y) \cap Q = \emptyset, L(y, s) = u, A_{y,s}]$ is in the sigma-algebra generated by $\{X_w : w \notin Q, w \ne u\}$.
In fact, starting at $y$, it is possible to determine the path of length $s$ starting at $y$ until
it reaches a node in $Q \cup \{u\}$. If any node in $Q$ is reached before $s$ steps, then $[P_s(y) \cap Q = \emptyset]$
cannot hold. Moreover, if node $u$ is reached before $s$ steps, $[L(y, s) = u]$ cannot hold because
$u$ is not the root and the attachment variable $X$ is strictly smaller than 1. Otherwise,
knowing the path $P_s(y)$, it is easy to determine whether $[P_s(y) \cap Q = \emptyset, L(y, s) = u, A_{y,s}]$
holds or not.

(ii) $[\lfloor uX_u \rfloor \in Q]$ is in the sigma-algebra generated by $X_u$.

(iii) $[P_t(x) = Q]$ is in the sigma-algebra generated by $\{X_w : w \in Q\}$, using an argument
similar to (i).

We conclude by recalling that the random variables $X_0, X_1, \dots, X_{3n}$ are independent. □

It follows that

\[
\begin{aligned}
\mathbf{P}\{T = s, A_{x,t} \cap A_{y,t}\} &\le \sum_{Q \in \Gamma} 1_{A_{x,t}}(Q) \sum_{u \notin Q} \mathbf{P}\{P_s(y) \cap Q = \emptyset, L(y, s) = u, A_{y,s}\} \, \mathbf{P}\{P_t(x) = Q\} \, \mathbf{P}\{\lfloor uX_u \rfloor \in Q\} \\
&\le \sum_{Q \in \Gamma} 1_{A_{x,t}}(Q) \, \mathbf{P}\{P_t(x) = Q\} \, \mathbf{P}\{A_{y,s}\} \, (t + 1) \sup_{u \ge n\beta^s, \ w \ge n\beta^t} \mathbf{P}\{\lfloor uX_u \rfloor = w\} \\
&= \mathbf{P}\{A_{x,t}\} \, \mathbf{P}\{A_{y,s}\} \, (t + 1) \sup_{u \ge n\beta^s, \ w \ge n\beta^t} \mathbf{P}\{\lfloor uX_u \rfloor = w\}.
\end{aligned}
\]

The last inequality holds because when the event $A_{x,t}$ holds, all nodes in $P_t(x)$ have a label
at least $n\beta^t$. In order to bound the collision probability $\mathbf{P}\{\lfloor uX_u \rfloor = w\}$, we first notice that
$w > 0$. So we can use the fact that conditioned on $X > 0$, $X$ has a density bounded by $\kappa$:

\[ \mathbf{P}\{\lfloor uX_u \rfloor = w\} \le \mathbf{P}\Big\{X_u \in \Big[\frac{w}{u}, \frac{w + 1}{u}\Big)\Big\} \le \frac{\kappa}{u}. \]

Thus,

\[ \mathbf{P}\{T = s, A_{x,t} \cap A_{y,t}\} \le \mathbf{P}\{A_{x,t}\} \, \mathbf{P}\{A_{y,s}\} \, \frac{\kappa (t + 1)}{n\beta^s}. \]

Repeating the above argument for $T = +\infty$, we get

\[
\begin{aligned}
\mathbf{P}\{T = +\infty, A_{x,t} \cap A_{y,t}\} &\le \sum_{Q \in \Gamma} 1_{A_{x,t}}(Q) \, \mathbf{P}\{P_t(y) \cap Q = \emptyset, \ A_{y,t}, \ P_t(x) = Q\} \\
&\le \Big(\sum_{Q \in \Gamma} 1_{A_{x,t}}(Q) \, \mathbf{P}\{P_t(x) = Q\}\Big) \mathbf{P}\{A_{y,t}\} \\
&= \mathbf{P}\{A_{x,t}\} \, \mathbf{P}\{A_{y,t}\}. \quad □
\end{aligned}
\]

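Theorem 3. Let there exist $p \in [0, 1]$ such that with probability $p$, $X$ has an atom at 0, and
with probability $1 - p$, $X$ has a bounded density on $[0, 1)$. The height $H_n$ of a SARRT with
attachment $X$ satisfies

\[ \frac{H_n}{\log n} \xrightarrow{p} \alpha_{\max} \quad \text{as } n \to \infty, \]

where $\alpha_{\max}$ is defined in equation (6).

Proof. If the atom at 0 has probability 1, then $H_n = 1$ and $\alpha_{\max} = 0$. In the rest of the
proof, we assume that $X$ is not a single mass. Fix $\delta \in (0, 1/2)$, $\varepsilon \in (0, 1)$ with $3\delta < \varepsilon$ and
$c \in (1/\mu, \alpha_{\max})$. Define $\beta = e^{-1/c}$ and $t = \lfloor (1 - \varepsilon) c \log n \rfloor$. Our objective is to show that

\[ \lim_{n \to \infty} \mathbf{P}\{H_{3n} \ge t\} = 1. \]

For this we consider the event

\[ \bigcup_{x=2n+1}^{3n} A_{x,t}, \]

where the events $A_{x,t}$ are defined in equation (13). The fact that $A_{x,t}$ holds implies that
$L(x, t) \ge n\beta^t \ge n / n^{1 - \varepsilon} = n^\varepsilon \ge 1$, i.e., the depth of node $x$ is at least $t$. A lower bound on the
probability is given by the following second moment inequality [5]:

\[ (14) \quad \mathbf{P}\Big\{\bigcup_{x=2n+1}^{3n} A_{x,t}\Big\} \ge \frac{\big(\sum_{x=2n+1}^{3n} \mathbf{P}\{A_{x,t}\}\big)^2}{\sum_{x=2n+1}^{3n} \mathbf{P}\{A_{x,t}\} + \sum_{x \ne y} \mathbf{P}\{A_{x,t} \cap A_{y,t}\}}. \]

The symbol $\sum_{x \ne y}$ is used instead of $\sum_{x=2n+1}^{3n} \sum_{y=2n+1, y \ne x}^{3n}$ to keep the notation light. Let
$t_0(c, \delta, \mathcal{L}(X))$ be defined as in Lemma 3. When $n$ is large enough, the conditions $t \ge t_0$ and
$n \ge t\beta^{-t}$ are met. So Lemma 3 gives

\[ (15) \quad \mathbf{P}\{A_{x,t}\} \ge \frac{\beta^t}{t} \ge \frac{1}{t n^{1 - \varepsilon}}. \]

Now, fixing $x \ne y$, we have by Lemma 4

\[ \mathbf{P}\{A_{x,t} \cap A_{y,t}\} \le \sum_{s=0}^{t-1} \mathbf{P}\{A_{x,t}\} \mathbf{P}\{A_{y,s}\} \frac{\kappa (t + 1)}{n\beta^s} + \mathbf{P}\{A_{x,t}\} \mathbf{P}\{A_{y,t}\}. \]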
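For $s \ge t_0$, we apply Lemma 3 to find an upper bound on $\mathbf{P}\{A_{y,s}\}$:

\[
\begin{aligned}
(16) \quad \mathbf{P}\{A_{x,t} \cap A_{y,t}\} &\le \mathbf{P}\{A_{x,t}\} \Big(\sum_{s=0}^{t_0 - 1} \frac{\kappa (t + 1)}{n\beta^s} + \sum_{s=t_0}^{t-1} \beta^{(\Psi(c) - \delta) s} \frac{\kappa (t + 1)}{n\beta^s} + \mathbf{P}\{A_{y,t}\}\Big) \\
&\le \mathbf{P}\{A_{x,t}\} \Big(\frac{\kappa (t + 1) t_0}{n\beta^{t_0}} + \frac{\kappa (t + 1)}{n} \sum_{s=0}^{t-1} \beta^{(\Psi(c) - \delta - 1) s} + \mathbf{P}\{A_{y,t}\}\Big) \\
&\le \mathbf{P}\{A_{x,t}\} \Big(\frac{\kappa (t + 1) t_0}{n\beta^{t_0}} + \frac{\kappa (t + 1)}{n} \cdot \frac{\beta^{(\Psi(c) - \delta - 1) t}}{\beta^{(\Psi(c) - \delta - 1)} - 1} + \mathbf{P}\{A_{y,t}\}\Big).
\end{aligned}
\]

We now show that the dominating term is $\mathbf{P}\{A_{x,t}\} \mathbf{P}\{A_{y,t}\}$. Using inequality (15),

\[ (17) \quad \frac{t / n}{\mathbf{P}\{A_{y,t}\}} \le \frac{t^2 n^{1 - \varepsilon}}{n} = O(n^{-\varepsilon / 2}) \]

as $t = O(\log n)$. Moreover, using the more precise lower bound on $\mathbf{P}\{A_{y,t}\}$ given in Lemma 3,

\[ \frac{t \beta^{(\Psi(c) - \delta - 1) t}}{n \mathbf{P}\{A_{y,t}\}} \le \frac{t^2 \beta^{(\Psi(c) - \delta - 1) t} \beta^{-(\Psi(c) + \delta) t}}{n} = \frac{t^2 (\beta^{-t})^{2\delta} \beta^{-t}}{n}. \]

By definition of $t$, $\beta^{-t} \le n^{1 - \varepsilon}$, and thus, since $3\delta < \varepsilon$,

\[ (18) \quad \frac{t \beta^{(\Psi(c) - \delta - 1) t}}{n \mathbf{P}\{A_{y,t}\}} \le t^2 n^{2\delta (1 - \varepsilon)} n^{-\varepsilon} = O(n^{-\varepsilon / 4}). \]

Plugging inequalities (17) and (18) into (16), we get

\[ \mathbf{P}\{A_{x,t} \cap A_{y,t}\} \le \mathbf{P}\{A_{x,t}\} \mathbf{P}\{A_{y,t}\} (1 + O(n^{-\varepsilon / 4})). \]

Taking the sum over all nodes $x \ne y$ with $x, y \in \{2n + 1, \dots, 3n\}$, we obtain

\[ \sum_{x \ne y} \mathbf{P}\{A_{x,t} \cap A_{y,t}\} \le \Big(\sum_{x=2n+1}^{3n} \mathbf{P}\{A_{x,t}\}\Big)^2 (1 + O(n^{-\varepsilon / 4})). \]

Moreover, using inequality (15), we have

\[ \sum_{x=2n+1}^{3n} \mathbf{P}\{A_{x,t}\} \ge \frac{n}{t n^{1 - \varepsilon}} = \frac{n^\varepsilon}{t}. \]

Thus, plugging these bounds in (14), we get

\[ \mathbf{P}\Big\{\bigcup_{x=2n+1}^{3n} A_{x,t}\Big\} \ge \frac{1}{\big(\sum_{x=2n+1}^{3n} \mathbf{P}\{A_{x,t}\}\big)^{-1} + 1 + O(n^{-\varepsilon / 4})} \ge 1 - O(n^{-\varepsilon / 4}) - O(t n^{-\varepsilon}). \]

This shows that

\[ (19) \quad \mathbf{P}\{H_{3n} \ge t\} = \mathbf{P}\{H_{3n} \ge \lfloor (1 - \varepsilon) c \log n \rfloor\} \ge 1 - O(n^{-\varepsilon / 4}). \]

We conclude that for any $\varepsilon > 0$,

\[ \lim_{n \to \infty} \mathbf{P}\{H_n \ge (1 - \varepsilon) \alpha_{\max} \log n\} = 1. \]

Combining this with the upper bound proved in Lemma 1 we get the desired result. □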
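3.3. Attachment distribution with unbounded density. In order to handle attachment
distributions $X$ having unbounded densities, the next lemma shows that we can approximate
$X$ by $X_\delta$ that has bounded density and an atom at 0.

Lemma 5. Assume that $X \in [0, 1)$ has a density, and let $z \ge -\mu$ be such that $\Lambda^*(z) < +\infty$.
Then for all $\delta > 0$, there exists $X_\delta \le X$ such that $\mathcal{L}(X_\delta)$ has a bounded density and an atom
at 0, such that

\[ \Lambda^*(z) \le \Lambda_\delta^*(z) \le \Lambda^*(z) + \delta, \]

where $\Lambda_\delta^*$ is defined as in (11) for $X_\delta$.

Proof. The constants $\eta, b > 0$ will be chosen later. Let $f$ be the density of $\mathcal{L}(X)$ and define the
event $A = [f(X) > b]$. Take $b$ such that $\mathbf{P}\{A\} \le \eta$. Define $X_\delta = 0 \cdot 1_A + X 1_{A^c}$. We have

\[
\begin{aligned}
\Lambda_\delta^*(z) &= \sup_{\lambda \ge 0} \{\lambda z - \log \mathbf{E}\{X_\delta^\lambda\}\} \\
&= -\log \inf_{\lambda \ge 0} \{e^{-\lambda z} \mathbf{E}\{(0 \cdot 1_A + X 1_{A^c})^\lambda\}\} \\
&= -\log \inf_{\lambda \ge 0} \{e^{-\lambda z} \mathbf{E}\{0 \cdot 1_A + X^\lambda 1_{A^c}\}\} \\
&= -\log \inf_{\lambda \ge 0} \{e^{-\lambda z} (\mathbf{E}\{X^\lambda\} - \mathbf{E}\{X^\lambda 1_A\})\}.
\end{aligned}
\]

Note that the expression $\lambda z - \log \mathbf{E}\{X_\delta^\lambda\}$ is understood to evaluate to $-\log(1 - \mathbf{P}\{A\})$ for $\lambda = 0$
as in equation (11). Trivially, we first get $\Lambda_\delta^*(z) \ge \Lambda^*(z)$. Moreover, using the Cauchy-Schwarz
inequality,

\[ \mathbf{E}\{X^\lambda 1_A\} \le \sqrt{\mathbf{E}\{X^{2\lambda}\}} \sqrt{\mathbf{P}\{A\}} \le \sqrt{\mathbf{E}\{X^{2\lambda}\}} \sqrt{\eta}. \]

Thus,

\[
\begin{aligned}
\Lambda_\delta^*(z) &\le -\log \inf_{\lambda \ge 0} \Big\{e^{-\lambda z} \mathbf{E}\{X^\lambda\} - \sqrt{\eta} \sqrt{e^{-2\lambda z} \mathbf{E}\{X^{2\lambda}\}}\Big\} \\
&\le -\log \Big(\inf_{\lambda \ge 0} \{e^{-\lambda z} \mathbf{E}\{X^\lambda\}\} - \sqrt{\eta} \sqrt{\inf_{\lambda \ge 0} \{e^{-2\lambda z} \mathbf{E}\{X^{2\lambda}\}\}}\Big) \\
&= -\log \big(e^{-\Lambda^*(z)} - \sqrt{\eta} \, e^{-\Lambda^*(z)/2}\big) \\
&= \Lambda^*(z) - \log \big(1 - \sqrt{\eta} \, e^{\Lambda^*(z)/2}\big).
\end{aligned}
\]

By choosing $\eta$ so that $-\log(1 - \sqrt{\eta} \, e^{\Lambda^*(z)/2}) \le \delta$, we obtain the desired result. □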



We can now restate the theorem for any density. 

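Theorem 4. Let there exist $p \in [0, 1]$ such that with probability $p$, $X$ has an atom at 0, and
with probability $1 - p$, $X$ has a bounded density on $[0, 1)$. The height $H_n$ of a SARRT with
attachment $X$ satisfies

\[ \frac{H_n}{\log n} \xrightarrow{p} \alpha_{\max} \quad \text{as } n \to \infty, \]

where $\alpha_{\max}$ is defined in equation (6).

Proof. If the atom has probability 1, then Theorem 3 can be applied. In the rest of the proof,
we assume that the atom at 0 has weight less than one. Since Lemma 1 does not have any
restrictions on the distribution $\mathcal{L}(X)$, we have for any $\varepsilon > 0$,

\[ \lim_{n \to \infty} \mathbf{P}\{H_n \ge (\alpha_{\max} + \varepsilon) \log n\} = 0. \]

For the lower bound, we use Theorem 3 via the transformation defined in Lemma 5. Let
$\varepsilon > 0$ and pick $\delta > 0$ small enough so that $\Psi(\alpha_{\max} - \varepsilon) + \alpha_{\max} \delta < 1$. This is possible because
$\Psi(\alpha_{\max} - \varepsilon) < 1$ (Proposition 5 in Appendix B). Then define $X_\delta$ as in Lemma 5 so that
$\Lambda^*(z) \le \Lambda_\delta^*(z) \le \Lambda^*(z) + \delta$. Define a tree $\tilde{T}_n$ with a sequence $\tilde{X}_0, \dots, \tilde{X}_n$ of independent
random variables distributed as $X_\delta$. Using Theorem 3 for the tree $\tilde{T}_n$, we get in particular a
lower bound on its height $\tilde{H}_n$:

\[ \lim_{n \to \infty} \mathbf{P}\{\tilde{H}_n \le (\tilde{\alpha}_{\max} - \varepsilon) \log n\} = 0, \]

where $\tilde{\alpha}_{\max} = \inf\{c : c > 1/\tilde{\mu} \text{ and } \tilde{\Psi}(c) > 1\}$, $\tilde{\mu} = \mathbf{E}\{-\log X_\delta\}$ and $\tilde{\Psi}(c) = c \Lambda_\delta^*(-1/c)$. Recall that $X_\delta$ as
obtained from Lemma 5 satisfies $X_\delta \le X$, which implies that $\tilde{H}_n$ is stochastically not larger
than $H_n$. Thus,

\[ \lim_{n \to \infty} \mathbf{P}\{H_n \le (\tilde{\alpha}_{\max} - \varepsilon) \log n\} = 0. \]

Next, if $\Psi$ is the function defined in (5) for the (original) random variable $X$ and
$\alpha_{\max} = \inf\{c : c > 1/\mu \text{ and } \Psi(c) > 1\}$, we have by construction of $X_\delta$,

\[ \Psi(\alpha_{\max} - \varepsilon) \le \tilde{\Psi}(\alpha_{\max} - \varepsilon) \le \Psi(\alpha_{\max} - \varepsilon) + \alpha_{\max} \delta < 1. \]

As a result, by definition of $\tilde{\alpha}_{\max}$, we have

\[ \tilde{\alpha}_{\max} \ge \alpha_{\max} - \varepsilon, \]

so that

\[ \lim_{n \to \infty} \mathbf{P}\{H_n \le (\alpha_{\max} - 2\varepsilon) \log n\} = 0. \quad □ \]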
3.4. Almost sure convergence and convergence in mean. Using Proposition 2 below and the explicit probability bounds given in the proof of Lemma 1 and in equation (19), we get $\lim_{n \to \infty} H_n / \log n = \alpha_{\max}$ almost surely, as stated above in Theorem 2. We should mention that Pittel [26] also proved almost sure convergence of the height for the URRT.



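Proposition 2. Let $H_n$ be a non-decreasing sequence of random variables and let $\alpha > 0$ be such that for all $\varepsilon > 0$,

$$\mathbf{P}\{H_n > (\alpha + \varepsilon)\log n\} = O\!\left(\frac{1}{\log n}\right) \quad\text{and}\quad \mathbf{P}\{H_n < (\alpha - \varepsilon)\log n\} = O\!\left(\frac{1}{\log n}\right).$$

Then, with probability 1,

$$\lim_{n \to \infty} \frac{H_n}{\log n} = \alpha.$$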
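Proof. Let $\gamma > 3$ be an integer. We consider the maxima of the sequence $H_n/\log n$ for $n$ in intervals of the form $[\gamma^{k^2}, \gamma^{(k+1)^2})$ for positive integers $k$. For $\varepsilon > 0$, we have

$$\begin{aligned}
\mathbf{P}\Big\{\max_{\gamma^{k^2} \le n < \gamma^{(k+1)^2}} \frac{H_n}{\log n} > \alpha + \varepsilon\Big\}
&\le \mathbf{P}\big\{H_{\gamma^{(k+1)^2}} > (\alpha + \varepsilon)\log \gamma^{k^2}\big\} \\
&\le \mathbf{P}\big\{H_{\gamma^{(k+1)^2}} > (\alpha + \varepsilon)\big((k+1)^2 \log \gamma - (2k+1)\log \gamma\big)\big\} \\
&= \mathbf{P}\Big\{H_{\gamma^{(k+1)^2}} > (\alpha + \varepsilon)\log\big(\gamma^{(k+1)^2}\big)\Big(1 - \frac{2k+1}{(k+1)^2}\Big)\Big\} \\
&= O\!\left(\frac{1}{(k+1)^2 \log \gamma}\right) = O\!\left(\frac{1}{k^2}\right).
\end{aligned}$$

Using the Borel-Cantelli lemma, there exists $k_0$ such that $\max_{n \ge \gamma^{k_0^2}} H_n/\log n \le \alpha + \varepsilon$ with probability 1. Similarly,

$$\mathbf{P}\Big\{\min_{\gamma^{k^2} \le n < \gamma^{(k+1)^2}} \frac{H_n}{\log n} < \alpha - \varepsilon\Big\} = O\!\left(\frac{1}{k^2}\right).$$

Thus, there exists $n_0$ such that for $n \ge n_0$, $\alpha - \varepsilon < H_n/\log n < \alpha + \varepsilon$ almost surely. □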

implies the convergence of the sequence ■ 

Theorem 5. Let there exist $p \in [0, 1]$ such that with probability $p$, $X$ has an atom at 0, and with probability $1 - p$, $X$ has a bounded density on $[0, 1)$. The height $H_n$ of a SARRT with attachment $X$ satisfies

$$\lim_{n \to \infty} \frac{\mathbf{E}\{H_n\}}{\log n} = \alpha_{\max},$$

where $\alpha_{\max}$ is defined in equation (6).
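Proof. For any $\varepsilon > 0$,

$$\mathbf{E}\{H_n\} \ge (\alpha_{\max} - \varepsilon)\log n \cdot \mathbf{P}\{H_n \ge (\alpha_{\max} - \varepsilon)\log n\}.$$

Taking the limit as $n \to \infty$ and observing that the inequality holds for any $\varepsilon > 0$,

$$\liminf_{n \to \infty} \frac{\mathbf{E}\{H_n\}}{\log n} \ge \alpha_{\max}.$$

For the upper bound, fix $\varepsilon > 0$. We have

$$\begin{aligned}
\mathbf{E}\{H_n\} &\le (\alpha_{\max} + \varepsilon)\log n + 2 + \sum_{t = \lceil(\alpha_{\max} + \varepsilon)\log n + 2\rceil}^{\infty} \mathbf{P}\{H_n \ge t\} \\
&\le (\alpha_{\max} + \varepsilon)\log n + 2 + \log n \cdot \sum_{i=0}^{\infty} \mathbf{P}\{H_n \ge (\alpha_{\max} + \varepsilon + i)\log n + 2\}.
\end{aligned}$$

The bound of Lemma 1 gives

$$\mathbf{P}\{H_n \ge (\alpha_{\max} + \varepsilon + i)\log n + 2\} \le n^{1 - \Psi(\alpha_{\max} + \varepsilon + i)}.$$

But using the monotonicity of $\Lambda^*$ (Proposition 3),

$$\Psi(\alpha_{\max} + \varepsilon + i) = (\alpha_{\max} + \varepsilon + i)\,\Lambda^*\!\Big(-\frac{1}{\alpha_{\max} + \varepsilon + i}\Big) \ge (\alpha_{\max} + \varepsilon + i)\,\Lambda^*\!\Big(-\frac{1}{\alpha_{\max}}\Big) \ge 1 + \frac{i}{\alpha_{\max}}.$$

In the last inequality, we used the definition of $\alpha_{\max}$ (equation (6)). Thus,

$$\mathbf{E}\{H_n\} \le (\alpha_{\max} + \varepsilon)\log n + 2 + (\log n)\cdot n^{1 - \Psi(\alpha_{\max} + \varepsilon)} + \log n \cdot \sum_{i=1}^{\infty} n^{-i/\alpha_{\max}}.$$

Finally,

$$\limsup_{n \to \infty} \frac{\mathbf{E}\{H_n\}}{\log n} \le \alpha_{\max} + \varepsilon. \quad \square$$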

4. The minimum depth 

In the previous section, we considered the maximum depth or height of a tree. In this section, we study the minimum depth. Observe that considering the minimum depth over all the nodes is not interesting: $\min_{0 \le i \le n} D_i = D_0 = 0$. Instead, we define the minimum depth by $M_n = \min_{n/2 \le i \le n} D_i$. The reader will be easily convinced that the results remain unchanged if we consider $\min_{\delta n \le i \le n} D_i$ for some $\delta \in (0, 1)$.

The objective of this section is to show that $M_n/\log n \to \alpha_{\min}$ in probability, where

$$(20)\qquad \alpha_{\min} = \begin{cases} 0 & \text{if } [0, 1/\mu) \cap \{c : \Psi(c) > 1\} = \emptyset, \\ \sup\{c : 0 < c < 1/\mu \text{ and } \Psi(c) > 1\} & \text{otherwise,} \end{cases}$$

and $\Psi$ is defined as in Section 3.1. Note that if $\mu = \mathbf{E}\{-\log X\} = +\infty$, then $\alpha_{\min} = 0$, and $M_n/\log n \to 0$ in probability using Theorem 1. In the sequel, we assume $\mu < +\infty$. In this case, provided that $X$ is not constant, Proposition 5 in Appendix B implies that $\alpha_{\min} < 1/\mu$. The following theorem sums up the results we prove in this section.

Theorem 6. The minimum depth $M_n$ of a SARRT with attachment $X$ having a density satisfies

$$\frac{M_n}{\log n} \xrightarrow{\;p\;} \alpha_{\min} \quad \text{as } n \to \infty,$$

where $\alpha_{\min}$ is defined in equation (20).

Remark. If $X = a \in [0, 1)$ with probability 1, then $\alpha_{\min} = 1/\mu = -1/\log a$ and it is easy to see that the results of the theorem also hold in this case.
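The definition of $M_n$ can be made concrete with a small simulation. The following is a minimal sketch, not the authors' code: the function names `sarrt_depths` and `minimum_depth` and the sampler argument `draw_x` are hypothetical, and the run at the end assumes a uniform attachment variable.

```python
import math
import random

def sarrt_depths(n, draw_x, rng):
    """Depths D_0..D_n of a SARRT in which node i attaches to floor(i * X_i)."""
    depth = [0] * (n + 1)                 # the root (label 0) has depth 0
    for i in range(1, n + 1):
        # X_i in [0, 1) gives a parent in {0, ..., i-1}; the min() is only a
        # guard against floating-point rounding.
        parent = min(int(i * draw_x(rng)), i - 1)
        depth[i] = depth[parent] + 1
    return depth

def minimum_depth(depth):
    """M_n = min of D_i over n/2 <= i <= n."""
    n = len(depth) - 1
    return min(depth[(n + 1) // 2:])      # indices ceil(n/2), ..., n

# Illustrative run with X uniform on [0, 1), i.e. a uniform random recursive tree.
rng = random.Random(0)
n = 20000
depth = sarrt_depths(n, lambda r: r.random(), rng)
print(depth[n] / math.log(n), max(depth) / math.log(n), minimum_depth(depth) / math.log(n))
```

For moderate $n$ the three printed ratios are only rough indications of the limits, since the convergence is on a logarithmic scale.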

The proof of Theorem 6 follows the same general idea as for the height with some complications for the upper bound. A lower bound on $M_n$ similar to the upper bound for the height (Section 3.1) is given in the next section. The proof of the upper bound is more delicate and it is the topic of Section 4.2. Observe that $M_n/\log n$ does not converge almost surely as there are nodes with arbitrarily large labels that choose the root as parent.
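4.1. The minimum depth: lower bound.

Lemma 6. For any $c < \alpha_{\min}$, we have $\mathbf{P}\{M_n < c\log n\} \to 0$.

Proof. If $\alpha_{\min} = 0$, then the lemma clearly holds. For $\alpha_{\min} > 0$, a calculation similar to that of Lemma 1 shows that

$$\mathbf{P}\{D_n \le \lfloor c\log n\rfloor\} \le \left(\frac{n}{1 + \lfloor c\log n\rfloor}\right)^{-\Psi(c)}$$

using the definition of $\Psi$. By applying a union bound, we get a lower bound on the shortest path:

$$\begin{aligned}
\mathbf{P}\{M_n \le \lfloor c\log n\rfloor\} &= \mathbf{P}\Big\{\min_{n/2 \le i \le n} D_i \le \lfloor c\log n\rfloor\Big\} \\
&\le n\,\mathbf{P}\{D_{\lfloor n/2\rfloor} \le \lfloor c\log n\rfloor\} \\
&= O\!\left(n\left(\frac{n}{\log n}\right)^{-\Psi(c)}\right),
\end{aligned}$$

which tends to 0 because $\Psi(c) > 1$ for $c < \alpha_{\min}$. □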

4.2. The minimum depth: upper bound. In this section, we introduce the possibility for $X$ to have an atom at $+\infty$. This is needed only to take care of attachment distributions that have unbounded densities. A node $x$ for which $X_x = +\infty$ is attached to an imaginary node at $+\infty$ that does not have any ancestor, so that $L(x, s) = +\infty$ for all $s \ge 1$. Even though such a choice of $X$ does not fit in our definition of a SARRT, it is only used as an auxiliary construction, and it is still possible to define all the quantities that are based on $X$. We define $\Lambda^*$ for a random variable $\log X$ that has an atom at $+\infty$ as in the case of an atom at $-\infty$ (see equation (11)):

$$\Lambda^*(z) = \max\Big\{\sup_{\lambda}\big\{\lambda z - \log \mathbf{E}\{e^{\lambda \log X}\}\big\},\ -\log(1 - \mathbf{P}\{X = +\infty\})\Big\}$$

for all $z < \mathbf{E}\{\log X\}$. The function $\Psi$ is defined as in Section 3.1. We can then prove a statement analogous to Corollary 1, which we state below.

Corollary 2. Let $X$ have an atom at $+\infty$ with mass $p \in [0, 1)$ and any distribution on $(0, 1)$ with total mass $1 - p$ such that $\mathbf{E}\{\log X\}$ is well-defined. Let $X_1, \dots, X_t$ be i.i.d. random variables distributed as $X$. Then,

$$\mathbf{P}\{X_1 \cdots X_t \le e^{at}\} = \exp(-t\Lambda^*(a) + o(t)) \quad \begin{cases} \text{for } a \le \mathbf{E}\{\log X\} & \text{if } \mathbf{E}\{\log X\} < +\infty, \\ \text{for } a \in \mathbb{R} & \text{if } \mathbf{E}\{\log X\} = +\infty. \end{cases}$$



Recall that for the height, we defined the event $A_{x,t}$ (equation (13)) which captures the idea that the path up to the root originating from $x$ keeps large enough labels. By analogy, the corresponding event $B_{x,t}$ for the minimum depth is to have a path whose labels stay small in all steps. Given a design parameter $\beta \in (0, 1)$,

$$(21)\qquad B_{x,t}(\beta) = \big[L(x, 1) \le 2n\beta,\ L(x, 2) \le 2n\beta^2,\ \dots,\ L(x, t) \le 2n\beta^t\big].$$

The following lemma gives a bound on the probability of the event $B_{x,t}$ assuming that $X$ has a bounded density and an atom at $+\infty$. The proof is based on a rotation argument and is similar to that of Lemma 3 with some minor modifications. Hence, we omit it to shorten the presentation.
Lemma 7. Let $X$ have an atom of weight $p \in [0, 1)$ at $+\infty$, and any distribution, of total mass $1 - p$, on $(0, 1)$. Moreover, assume $\mu = \mathbf{E}\{-\log X\}$ is well-defined and not $+\infty$. Define $\theta = +\infty$ if $\mathbf{E}\{-\log X\} = -\infty$ (equivalently, if $p > 0$) and $\theta = 1/\mu$ otherwise. Let $c \in (\alpha_{\min}, \theta)$, $\beta = e^{-1/c}$ and $\delta > 0$ such that $\Psi(c) + \delta < 1$. Then there exists $t_0 = t_0(c, \mathcal{L}(X))$ such that for all integers $t \ge t_0$, $n \ge t\beta^{-t}$ and $n + 1 \le x \le 2n$,



Next, we prove that there is enough independence between the events $B_{x,t}$ to allow us to use the second moment method. In the context of the study of the height (Section 3.2), this is done for the events $A_{x,t}$ in Lemma 4, where the probability of the event $[A_{x,t} \cap A_{y,t}]$ is bounded by estimating the probability of collisions. To obtain such a bound for the event $[B_{x,t} \cap B_{y,t}]$, the main difference is that we condition on the different intervals of labels where the collision might take place instead of the collision time $T$. This is because, unlike the event $A_{x,t}$ which gives a lower bound on the labels of the nodes in the path from node $x$ to the root, the event $B_{x,t}$ only implies an upper bound on the labels. Being able to bound from below the node labels is important to bound the collision probability.

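Lemma 8. Let $X$ have an atom of weight $p \in [0, 1)$ at $+\infty$, and a density bounded by $\kappa$, of total mass $1 - p$, on $(0, 1)$. Let $x \ne y$ be elements of $\{n + 1, \dots, 2n\}$, let $t$ be a positive integer and let $\beta \in (0, 1)$. Then

$$\mathbf{P}\{B_{x,t} \cap B_{y,t}\} \le \sum_{s=1}^{t} \mathbf{P}\{B_{x,t}\}\,\mathbf{P}\{B_{y,s-1}\}\,\frac{t(t+1)\kappa}{n\beta^{s}} + \mathbf{P}\{B_{x,t}\}\,\mathbf{P}\{B_{y,t}\}.$$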

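Proof. We consider the collision time $T$ when the path starting at $y$ meets the path of $x$. Define $T = +\infty$ if $P_t(x) \cap P_t(y) = \emptyset$ and $T = \min\{s \ge 0 : L(y, s + 1) \in P_t(x)\}$ otherwise. We introduce the random variables $\tau(x, i) = \min\{s \ge 0 : L(x, s) \le 2n\beta^{i}\}$. We have $[\tau(x, s) \le s] = [L(x, s) \le 2n\beta^{s}]$ for every $s$. In order to be able to bound collisions, instead of conditioning on a fixed value of $T$ we condition on $T$ being in some interval $I_s = [\tau(y, s - 1), \tau(y, s))$ or $I_\infty = [\tau(y, t), +\infty)$. If $T \in I_s$, then we know that the collision happened between $n\beta^{s}$ and $n\beta^{s-1}$.

$$\mathbf{P}\{B_{x,t} \cap B_{y,t}\} = \sum_{s=1}^{t} \mathbf{P}\{T \in I_s, B_{x,t} \cap B_{y,t}\} + \mathbf{P}\{T \in I_\infty, B_{x,t} \cap B_{y,t}\}.$$

In order to evaluate this expression, we fix the path $P_t(x)$ from $x$ to its $t$-th ancestor and average over all possible paths in $\mathcal{F} = \{Q \subset \{0, \dots, 2n\} : x = \max Q, |Q| \le t\}$. We have

$$\begin{aligned}
\mathbf{P}\{T \in I_s, B_{x,t} \cap B_{y,t}\}
&= \sum_{Q \in \mathcal{F}} \sum_{\ell=0}^{t-1} \mathbf{P}\{T = \ell, \ell \in I_s, B_{x,t} \cap B_{y,t}, P_t(x) = Q\} \\
&\le \sum_{Q \in \mathcal{F}} \sum_{\ell=0}^{t-1} \mathbf{1}_{B_{x,t}}(Q)\, \mathbf{P}\{P_\ell(y) \cap Q = \emptyset, L(y, \ell + 1) \in Q, \ell \in I_s, B_{y,s-1}, P_t(x) = Q\} \\
&= \sum_{Q \in \mathcal{F}} \sum_{\ell=0}^{t-1} \mathbf{1}_{B_{x,t}}(Q) \sum_{u \notin Q} \mathbf{P}\{P_\ell(y) \cap Q = \emptyset, L(y, \ell) = u, \lfloor uX_u\rfloor \in Q, \ell \in I_s, B_{y,s-1}, P_t(x) = Q\}.
\end{aligned}$$

In order to simplify this expression, we use the independence claim below.

Claim. For any $Q \subset \{0, \dots, 2n\}$, $u \notin Q$ and $\ell \in \mathbb{N}$, the events $[\lfloor uX_u\rfloor \in Q]$, $[P_t(x) = Q]$ and $E = [P_\ell(y) \cap Q = \emptyset, L(y, \ell) = u, \ell \in I_s, B_{y,s-1}]$ are mutually independent.

Proof. As in Lemma 4, the event $[\lfloor uX_u\rfloor \in Q]$ is in the sigma-algebra generated by $X_u$ and $[P_t(x) = Q]$ is in the sigma-algebra generated by $\{X_w : w \in Q\}$. So we only show that $E$ is in the sigma-algebra generated by $\{X_w : w \notin Q, w \ne u\}$.

By looking just at variables from $\{X_w : w \notin Q, w \ne u\}$, it is possible to determine the path of length $\ell$ starting at $y$ until it reaches a node in $Q \cup \{u\}$. If any node in $Q$ is reached before $\ell$ steps, then $[P_\ell(y) \cap Q = \emptyset]$ cannot hold. Moreover, if node $u$ is reached before $\ell$ steps, $[L(y, \ell) = u]$ cannot hold. Otherwise, knowing the path $P_\ell(y)$, it is easy to determine whether $\ell \in I_s$. If in fact $\ell \in I_s$, we know that $\tau(y, s - 1) \le \ell$. So either $\ell \ge s - 1$, in which case we can clearly determine if $B_{y,s-1}$ holds, or $\ell < s - 1$, but then rewriting $B_{y,s-1}$ as

$$B_{y,s-1} = [\tau(y, 1) \le 1,\ \tau(y, 2) \le 2,\ \dots,\ \tau(y, s - 1) \le s - 1],$$

we can see that it is possible to determine whether $B_{y,s-1}$ holds or not. □

It follows that

$$\begin{aligned}
\mathbf{P}\{T \in I_s, B_{x,t} \cap B_{y,t}\}
&\le \sum_{Q \in \mathcal{F}} \sum_{\ell=0}^{t-1} \mathbf{1}_{B_{x,t}}(Q) \sum_{u \notin Q} \mathbf{P}\{E\}\,\mathbf{P}\{P_t(x) = Q\}\,\mathbf{P}\{\lfloor uX_u\rfloor \in Q\} \\
&\le \sum_{\ell=0}^{t-1} \Big(\sum_{Q \in \mathcal{F}} \mathbf{1}_{B_{x,t}}(Q)\,\mathbf{P}\{P_t(x) = Q\}\Big)\,\mathbf{P}\{B_{y,s-1}\}\,(t+1) \sup_{\substack{u:\, u \ge n\beta^{s} \\ w:\, w < +\infty}} \mathbf{P}\{\lfloor uX_u\rfloor = w\} \\
&= \mathbf{P}\{B_{x,t}\}\,\mathbf{P}\{B_{y,s-1}\}\,t(t+1) \sup_{\substack{u:\, u \ge n\beta^{s} \\ w:\, w < +\infty}} \mathbf{P}\{\lfloor uX_u\rfloor = w\}.
\end{aligned}$$

We can assume that $Q$ does not contain the node $+\infty$ because otherwise $B_{x,t}$ does not hold. Thus we can use the bound $\kappa$ on the density, which gives $\mathbf{P}\{\lfloor uX_u\rfloor = w\} \le \kappa/u$, to get

$$\mathbf{P}\{T \in I_s, B_{x,t} \cap B_{y,t}\} \le \mathbf{P}\{B_{x,t}\}\,\mathbf{P}\{B_{y,s-1}\}\,\frac{t(t+1)\kappa}{n\beta^{s}}.$$

Observing that the above argument can be repeated for $T \in I_\infty$, we get

$$\mathbf{P}\{T \in I_\infty, B_{x,t} \cap B_{y,t}\} \le \mathbf{P}\{B_{x,t}\}\,\mathbf{P}\{B_{y,t}\}. \quad \square$$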

We omit the proof of the next lemma as it is similar to the proof of Lemma 5.

Lemma 9. Assume that $X \in [0, 1)$ has a density and $\mathbf{E}\{-\log X\} < +\infty$, and let $z < -\mu$ be such that $\Lambda^*(z) < +\infty$. Then for all $\delta > 0$, there exists $X_\delta \ge X$ such that $\mathcal{L}(X_\delta)$ has a bounded density and an atom at $+\infty$, such that $\mathbf{E}\{\log X_\delta\}$ is well-defined and

$$\Lambda^*(z) \le \Lambda^*_\delta(z) \le \Lambda^*(z) + \delta.$$

We can now prove the main theorem of this section.
Theorem 6 (Restated). The minimum depth of a SARRT with attachment $X$ having a density, bounded or not, satisfies

$$\frac{M_n}{\log n} \xrightarrow{\;p\;} \alpha_{\min} \quad \text{as } n \to \infty,$$

where $\alpha_{\min}$ is defined in equation (20).



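Proof. Let $c \in (\alpha_{\min}, 1/\mu)$ and pick $\varepsilon > 0$ so that $\varepsilon/\mu < 1 - \Psi(c)$ (recall that $\mu = \mathbf{E}\{-\log X\} > 0$ and that we can assume $\mu < +\infty$). In order to handle the case where $X$ has an unbounded density, we define (using Lemma 9) an auxiliary random variable $X_\varepsilon \ge X$ with an atom at $+\infty$ and a density on $(0, 1)$ bounded by $\kappa = \kappa(\varepsilon)$ such that for all $z < -\mu$ such that $\Lambda^*(z) < +\infty$, we have

$$\Lambda^*(z) \le \Lambda^*_\varepsilon(z) \le \Lambda^*(z) + \varepsilon.$$

Define $\Psi_\varepsilon(c) = c\Lambda^*_\varepsilon(-1/c)$ and $\tilde{\alpha}_{\min} = \sup\, \{0\} \cup \{c : c \in \mathbb{R}_+ \text{ and } \Psi_\varepsilon(c) > 1\}$. By the choice of $c$ and $\varepsilon$,

$$\Psi(c) \le \Psi_\varepsilon(c) \le \Psi(c) + c\varepsilon < \Psi(c) + \varepsilon\mu^{-1} < 1,$$

so that $c > \tilde{\alpha}_{\min}$.

Consider a sequence of independent random variables $\tilde{X}_0, \dots, \tilde{X}_{2n}$ distributed as $X_\varepsilon$, constructed as in Lemma 9 so that $X_i \le \tilde{X}_i$ for all $1 \le i \le 2n$. We can define the associated ancestor labels $\tilde{L}(x, s)$ and events $\tilde{B}_{x,s}$ for any $x \in \{0, \dots, 2n\}$ and $s \ge 1$. Because $X_i \le \tilde{X}_i$ for every $1 \le i \le 2n$, we have for all $t \ge 1$ and $\beta \in (0, 1)$,

$$\mathbf{P}\Big\{\bigcup_{x=n+1}^{2n} B_{x,t}(\beta)\Big\} \ge \mathbf{P}\Big\{\bigcup_{x=n+1}^{2n} \tilde{B}_{x,t}(\beta)\Big\}.$$

To prove that $\mathbf{P}\{\bigcup_{x=n+1}^{2n} \tilde{B}_{x,t}(\beta)\}$ approaches 1 as $n \to \infty$, we proceed in a similar way as in Theorem 3. Fix $\delta \in (0, 1/2)$ with $3\delta < \varepsilon$, $\beta = e^{-1/c}$ and $t = \lfloor(1 - \varepsilon)c\log n\rfloor$. By the second moment method, we have

$$(22)\qquad \mathbf{P}\Big\{\bigcup_{x=n+1}^{2n} \tilde{B}_{x,t}\Big\} \ge \frac{\big(\sum_{x=n+1}^{2n} \mathbf{P}\{\tilde{B}_{x,t}\}\big)^2}{\sum_{x=n+1}^{2n} \mathbf{P}\{\tilde{B}_{x,t}\} + \sum_{x \ne y} \mathbf{P}\{\tilde{B}_{x,t} \cap \tilde{B}_{y,t}\}}.$$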

First, as $c > \tilde{\alpha}_{\min}$, we can use Lemma 7:

p{Bx,t} >-> 



t ~ t 

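Then, using Lemma 8 we get

$$\mathbf{P}\{\tilde{B}_{x,t} \cap \tilde{B}_{y,t}\} \le \sum_{s=1}^{t} \mathbf{P}\{\tilde{B}_{x,t}\}\,\mathbf{P}\{\tilde{B}_{y,s-1}\}\,\frac{t(t+1)\kappa}{n\beta^{s}} + \mathbf{P}\{\tilde{B}_{x,t}\}\,\mathbf{P}\{\tilde{B}_{y,t}\}.$$

Let $t_0$ be defined as in Lemma 7. A calculation similar to the one in the proof of Theorem 3 gives: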

P{^.n5.} (o (i) . ^ . .P{g4) . 

We end up with 

'Bx,t n < P {Bx,t} P (1 + 
Thus, going back to equation (22), we obtain
P 



2n ( 2n 



ka;=n+l ^ \x=n+l 

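When the event $B_{x,t}$ holds, $L(x, t) \le 2n\beta^{t} \le 2n \cdot e^{1/c} n^{-(1-\varepsilon)} = 2e^{1/c} n^{\varepsilon}$, i.e., the length of the path from $x$ to a node whose label is no larger than $2e^{1/c} n^{\varepsilon}$ is at most $t$. But using the upper bound on the height of a SARRT (Section 3.1), we know that the depth of a node labeled $m$ is at most $2\alpha_{\max}\log m$ with high probability (recall that $\alpha_{\max} < +\infty$). In fact,

$$\begin{aligned}
\mathbf{P}\{M_{2n} > c\log n + 2\varepsilon\alpha_{\max}\log n\}
&\le \mathbf{P}\{M_{2n} > t + 2\varepsilon\alpha_{\max}\log n\} \\
&\le \Big(1 - \mathbf{P}\Big\{\bigcup_{x=n+1}^{2n} B_{x,t}\Big\}\Big) + \mathbf{P}\Big\{\max_{i \le 2e^{1/c} n^{\varepsilon}} D_i > 2\varepsilon\alpha_{\max}\log n\Big\}.
\end{aligned}$$

We conclude that for any $\varepsilon > 0$,

$$\mathbf{P}\{M_n \le (1 + \varepsilon)\alpha_{\min}\log n\} \to 1.$$

Combining this with the lower bound proved in Lemma 6, we get the desired result. □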

5. Applications 

Giving $X$ the uniform $[0, 1)$ density provides a new elementary proof for the height of the URRT that avoids any mention of branching processes as has been done by Devroye [9] or Pittel [26]. Note that Cramér's Theorem is not needed in this case. Instead, Proposition 1 can be directly proven in this case using properties of the gamma distribution.

Moreover, setting $X = \max(U_1, \dots, U_k)$ and $X = \min(U_1, \dots, U_k)$, we can compute asymptotics for greedy distances introduced in Devroye and Janson [10]. A random $k$-DAG (or urrt) is a directed graph defined as follows. For each node $i = 1, \dots, n$, a random set of $k$ parents is picked with replacement uniformly from among the previous nodes $\{0, \dots, i - 1\}$ and the root is still labeled 0. A node of the graph has many paths going to the root. One can define many distances. Some aspects of the longest path distance were studied in Arya et al. [2], Tsukiji and Xhafa [32] and the shortest path distance in Devroye and Janson [10]. Moreover, the authors of [10] introduced two other distances defined by picking the path to the root following the smallest or largest labels. For instance, if one chooses the parent with the smallest label, this label is distributed as $\min(\lfloor nU_1\rfloor, \lfloor nU_2\rfloor, \dots, \lfloor nU_k\rfloor) = \lfloor n\min(U_1, \dots, U_k)\rfloor$. As a result, these distances can be studied in the framework introduced in this paper. We define $R_i^-$ and $R_i^+$ to be the distance from node $i$ to the root following these minimum and maximum label paths. These distances can also be seen as the depths of node $i$ in a URRT where each node is given a choice of $k$ independent parents. The random variable $R_i^-$ corresponds to the choice of the parent with the smallest label (oldest node) and $R_i^+$ corresponds to the choice of the newest parent.
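The reduction of the greedy distances to a SARRT can be illustrated with a short simulation. This is a minimal sketch under stated assumptions: the function name `greedy_distances` is hypothetical, the $k$ parent choices are drawn as independent uniforms, and the parent labels are computed through $\lfloor i\min_j U_j\rfloor$ and $\lfloor i\max_j U_j\rfloor$, which is the identity quoted above.

```python
import random

def greedy_distances(n, k, rng):
    """Return (r_minus, r_plus): distances to the root along smallest/largest-label parents."""
    r_minus = [0] * (n + 1)
    r_plus = [0] * (n + 1)
    for i in range(1, n + 1):
        us = [rng.random() for _ in range(k)]        # k independent uniform choices
        oldest = min(int(i * min(us)), i - 1)        # parent with the smallest label
        newest = min(int(i * max(us)), i - 1)        # parent with the largest label
        r_minus[i] = r_minus[oldest] + 1
        r_plus[i] = r_plus[newest] + 1
    return r_minus, r_plus

rng = random.Random(0)
r_minus, r_plus = greedy_distances(20000, 2, rng)
print(r_minus[-1], r_plus[-1])
```

Each of the two arrays is the depth sequence of a SARRT, with attachment variable $\min(U_1, \dots, U_k)$ and $\max(U_1, \dots, U_k)$ respectively; the two trees are coupled here only because they share the same uniforms.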

Let $X_{\max} = \max(U_1, \dots, U_k)$. Then, by Theorems 1, 2 and 6,

$$\frac{R_n^+}{\log n} \xrightarrow{\;p\;} \rho^+ = k \quad\text{and}\quad \frac{R_n^+ - k\log n}{\sqrt{k\log n}} \xrightarrow{\;\mathcal{L}\;} \mathcal{N}(0, 1),$$

and

$$\lim_{n \to \infty} \frac{\max_{1 \le i \le n} R_i^+}{\log n} = \rho^+_{\max} \text{ almost surely, and } \frac{\min_{n/2 \le i \le n} R_i^+}{\log n} \xrightarrow{\;p\;} \rho^+_{\min},$$

where $\rho^+_{\min}$ and $\rho^+_{\max}$ are defined as the solutions respectively smaller and larger than $k$ of the equation $-c + k - c\log\frac{k}{c} = 1$. Some numerical approximations generated using a program are shown in Table 1. It should be noted that the concentration for $R_n^+$ as well as for $R_n^-$ presented below were shown in Devroye and Janson [10] and Mahmoud [21], and the corresponding central limit theorems in Mahmoud [21].

We give expressions for the relevant functions introduced in the proof:

$$\begin{aligned}
\mathbf{E}\{-\log X_{\max}\} &= \frac{1}{k}, \\
\mathrm{Var}\{-\log X_{\max}\} &= \frac{1}{k^2}, \\
\Lambda(\lambda) &= -\log\Big(1 + \frac{\lambda}{k}\Big), \quad (\text{for } \lambda > -k) \\
\Lambda^*(z) &= -1 - kz - \log(-kz), \quad (\text{for } z < 0) \\
\Psi(c) &= -c + k - c\log\frac{k}{c}.
\end{aligned}$$
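As a check on these closed forms, the Legendre transform can be carried out by hand. The following derivation is a sketch that assumes $U_1, \dots, U_k$ are independent and uniform on $[0, 1)$, so that $X_{\max}$ has density $kx^{k-1}$ on $[0, 1)$.

```latex
\begin{align*}
\mathbf{E}\{X_{\max}^{\lambda}\} &= \int_0^1 k x^{k-1+\lambda}\,dx = \frac{k}{k+\lambda}
  \quad (\lambda > -k),
  \qquad\text{so}\qquad \Lambda(\lambda) = -\log\Bigl(1 + \frac{\lambda}{k}\Bigr), \\
\frac{d}{d\lambda}\bigl(\lambda z - \Lambda(\lambda)\bigr) &= z + \frac{1}{k+\lambda} = 0
  \iff \lambda = -\frac{1}{z} - k \quad (z < 0), \\
\Lambda^*(z) &= \Bigl(-\frac{1}{z} - k\Bigr) z + \log\Bigl(-\frac{1}{kz}\Bigr)
  = -1 - kz - \log(-kz), \\
\Psi(c) &= c\,\Lambda^*\Bigl(-\frac{1}{c}\Bigr)
  = c\Bigl(-1 + \frac{k}{c} - \log\frac{k}{c}\Bigr)
  = -c + k - c\log\frac{k}{c}.
\end{align*}
```

For $k = 1$ this reads $\Psi(c) = 1 - c + c\log c$, and $\Psi(e) = 1$, which is consistent with the constant $e$ for the height of the URRT.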

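Similarly, let $X_{\min} = \min(U_1, \dots, U_k)$. Then, setting $h_k = \sum_{i=1}^{k} \frac{1}{i}$ and $h_k^{(2)} = \sum_{i=1}^{k} \frac{1}{i^2}$,

$$\frac{R_n^-}{\log n} \xrightarrow{\;p\;} \rho^- = \frac{1}{h_k} \quad\text{and}\quad \frac{R_n^- - \frac{\log n}{h_k}}{\sqrt{\frac{h_k^{(2)}}{h_k^{3}}\log n}} \xrightarrow{\;\mathcal{L}\;} \mathcal{N}(0, 1),$$

and

$$\lim_{n \to \infty} \frac{\max_{1 \le i \le n} R_i^-}{\log n} = \rho^-_{\max} \text{ almost surely, and } \frac{\min_{n/2 \le i \le n} R_i^-}{\log n} \xrightarrow{\;p\;} \rho^-_{\min},$$

where $\rho^-_{\min}$ and $\rho^-_{\max}$ are defined as the solutions respectively smaller and larger than $1/h_k$ of the equation $\Psi(c) = 1$. See Table 1 for numerical approximations of these constants for different values of $k$.

An expression for $\Psi$ and other relevant functions are given for $X_{\min}$:

$$\begin{aligned}
\mathbf{E}\{-\log X_{\min}\} &= h_k, \\
\mathrm{Var}\{-\log X_{\min}\} &= h_k^{(2)}, \\
\Lambda(\lambda) &= -\sum_{i=1}^{k} \log\Big(1 + \frac{\lambda}{i}\Big), \quad (\text{for } \lambda > -1) \\
\Lambda^*(z) &= \lambda^*(z)\,z + \sum_{i=1}^{k} \log\Big(1 + \frac{\lambda^*(z)}{i}\Big), \quad (\text{for } z < 0) \\
\Psi(c) &= c\,\Lambda^*(-1/c),
\end{aligned}$$

where $\lambda^*(z)$ is the solution of $z + \sum_{i=1}^{k} \frac{1}{i + \lambda^*(z)} = 0$.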
Table 1. Approximate numerical values for some constants

  k    ρ⁺_min    ρ⁺    ρ⁺_max    ρ⁻_min    ρ⁻        ρ⁻_max
  1    0         1     e         0         1         e
  2    0.3734    2     4.3111    0         0.6667    1.6738
  3    0.9137    3     5.7640    0         0.5455    1.3025
  4    1.5296    4     7.1451    0         0.4800    1.1060
  5    2.1925    5     8.4805    0         0.4380    0.9818
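The constants $\rho^+_{\min}$ and $\rho^+_{\max}$ in Table 1 can be recomputed from the equation $-c + k - c\log\frac{k}{c} = 1$. The sketch below uses plain bisection; it is not the program used by the authors, and the helper names `psi_max`, `bisect_root` and `rho_plus` are hypothetical.

```python
import math

def psi_max(c, k):
    """Psi(c) = -c + k - c*log(k/c), the function Psi for X = max(U_1, ..., U_k)."""
    return -c + k - c * math.log(k / c)

def bisect_root(f, lo, hi, iters=200):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid      # the sign change is in the right half
        else:
            hi = mid                   # the sign change is in the left half
    return 0.5 * (lo + hi)

def rho_plus(k):
    """Return (rho_min, rho_max), the solutions of Psi(c) = 1 below and above k."""
    g = lambda c: psi_max(c, k) - 1.0
    # Psi(k) = 0 < 1, and Psi(20k) > 1, so a root lies in (k, 20k).
    rho_max = bisect_root(g, float(k), 20.0 * k)
    # For k >= 2, Psi(c) tends to k > 1 as c -> 0, so a root lies in (0, k).
    # For k = 1 there is no such root and the constant is 0.
    rho_min = bisect_root(g, 1e-12, float(k)) if k > 1 else 0.0
    return rho_min, rho_max

for k in range(1, 6):
    print(k, rho_plus(k))
```

The same bisection could be applied to the $X_{\min}$ column once $\Lambda^*$ is evaluated numerically through $\lambda^*(z)$.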



Remark. This of course can be repeated for $k$-DAGs where the parents of node $n$ are independent and distributed as $\lfloor nX\rfloor$ where $X \in [0, 1)$ (SARRD) and $\mathcal{L}(X)$ has any density.



6. Concluding remarks 

To compute the height of the tree, our proof uses the existence of a density for $\mathcal{L}(X)$ in order to bound the collision probability. The existence of a density is only used to find a lower bound on the height. The upper bound given here (Lemma 1) works for any distribution. It is natural to ask whether this upper bound is tight for a larger family of distributions, for example when $\mathcal{L}(X)$ has atoms. Atoms at 0 are handled by our proof. Note that for a deterministic $X = \theta \in (0, 1)$, the height of the tree, which is simply the depth of node $n$, is $(1 + o(1))\frac{\log n}{\log(1/\theta)}$. For example, if $\theta = \frac{1}{m}$ for an integer $m \ge 2$, the tree is a complete $m$-ary tree.
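The deterministic case is easy to check directly. The sketch below assumes $\theta = 1/m$, so that node $i$ attaches to $\lfloor i/m\rfloor$; integer division is used to avoid floating-point rounding, and the function name `depth_deterministic` is hypothetical.

```python
import math

def depth_deterministic(n, m):
    """Depth of node n when every node i >= 1 attaches to floor(i / m)."""
    d = 0
    while n > 0:
        n //= m        # move to the parent
        d += 1
    return d

n, m = 10**6, 10
print(depth_deterministic(n, m), math.log(n) / math.log(m))
```

The depth equals the number of base-$m$ digits of $n$, which differs from $\log n/\log(1/\theta)$ by at most 1.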

One can construct a random $k$-DAG or SARRD in the same way. Node $n$ chooses $k$ parents $\lfloor nX^{(1)}\rfloor, \lfloor nX^{(2)}\rfloor, \dots, \lfloor nX^{(k)}\rfloor$ where $X^{(1)}, \dots, X^{(k)}$ are independent copies of a random variable $X \in [0, 1)$. The "greedy" distance measures can be computed simply by considering the SARRT with attachment random variable $X_{\min} = \min(X^{(1)}, \dots, X^{(k)})$ and $X_{\max} = \max(X^{(1)}, \dots, X^{(k)})$. One could study the shortest and longest path distances in
a SARRD, which has been done for the uniform case in Arya et al. [2], Devroye and Janson 
[TO], D'Souza et al. [TO], and Tsukiji and Xhafa [32]. 

Another point mentioned in Devroye and Janson [10] is the relation between the SARRT model and random binary search trees (RBST). An RBST can be constructed incrementally by choosing one of the $n + 1$ external nodes at random and replacing it by the node that arrives at time $n$. The (random) arrival time of the parent of $n$ is roughly distributed as $\max(\lfloor U_1 n\rfloor, \lfloor U_2 n\rfloor)$. This suggests that the depth of nodes in an RBST and in a SARRT with attachment $X = \max(U_1, U_2)$ are related. Observe that the heights of these two different types of random trees are the same up to lower order terms: $H_n/\log n \to \alpha$ where $\alpha \approx 4.3111$ [8]. Consider a best-of-two-choices RBST in which each new node $n$ has two choices of keys, and chooses the one for which the parent arrived last. It would be interesting if the first order of the asymptotic height is the same for a best-of-two-choices RBST and for a SARRT with $X = \min(\max(U_1, U_2), \max(U_3, U_4)) \stackrel{\mathcal{L}}{=} \sqrt{1 - \sqrt{U}}$, for which $H_n/\log n \to c$ where $c \approx 2.364$. If one picks the parent closest to the root, then the analysis seems to be even more challenging.
Appendix A. Some pictures of SARRTs

We include some pictures of SARRTs for attachment random variables of the form $U^{\beta}$ for different values of $\beta$, where $U$ is uniform in $[0, 1)$. We color the nodes from light (green) to dark (red) as a linear function of their labels.

Note that for small values of $\beta$, the attachment distribution concentrates more around 1 and most of the nodes link to nodes of labels close to the bottom part of the tree. As $\beta$ becomes larger, the distribution is more concentrated near 0. The tree has a smaller height, and the root's degree increases.

[Figure 1. SARRT with distribution $U^{\beta}$ and $n = 500$. Recoverable panel labels: (c) $\beta = 2$, (d) $\beta = 3$.]
We prove some properties of Cramér's function $\Lambda^*$ as well as the function $\Psi$, both defined in Section 3.1. See [7] for more details on Cramér's theorem. Recall that $\Lambda^*$ is defined as:

$$\Lambda^*(z) = \sup_{\lambda}\{\lambda z - \Lambda(\lambda)\}, \quad\text{where } \Lambda(\lambda) = \log \mathbf{E}\{e^{\lambda Y}\}.$$

Note that in our case $Y = \log X$ is a negative random variable, so $\Lambda(\lambda) < +\infty$ for $\lambda \ge 0$.
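The definition of $\Lambda^*$ as a supremum can be checked numerically in a simple case. The sketch below assumes $X$ is uniform on $[0, 1)$, for which $\Lambda(\lambda) = -\log(1 + \lambda)$ for $\lambda > -1$ and $\Lambda^*(z) = -1 - z - \log(-z)$ for $z < 0$ (the $k = 1$ case of the formulas in Section 5). The function names and the search range for $\lambda$ are assumptions made for the illustration.

```python
import math

def cumulant_uniform(lam):
    """Lambda(lam) = log E{exp(lam * Y)} for Y = log U, U uniform on [0, 1); needs lam > -1."""
    return -math.log1p(lam)

def cramer_numeric(z, lam_lo=-1.0 + 1e-9, lam_hi=1e3, iters=300):
    """Approximate sup over lam of lam*z - Lambda(lam) by ternary search.

    The objective is concave in lam; the maximizer is assumed to lie in [lam_lo, lam_hi].
    """
    f = lambda lam: lam * z - cumulant_uniform(lam)
    lo, hi = lam_lo, lam_hi
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return f(0.5 * (lo + hi))

def cramer_closed_form(z):
    """Closed form of Lambda*(z) for the uniform case, valid for z < 0."""
    return -1.0 - z - math.log(-z)

for z in (-2.0, -1.0, -0.5):
    print(z, cramer_numeric(z), cramer_closed_form(z))
```

Here $\mu = \mathbf{E}\{-\log U\} = 1$, and the value at $z = -1$ is 0, in line with the fact that $\Lambda^*(-\mu) = 0$.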

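Proposition 3. Let $Y$ be a negative random variable with $\mathbf{E}\{Y\} = -\mu \in [-\infty, 0)$. Then:

(i) $\Lambda^*(z) \in [0, +\infty]$ for all $z \in \mathbb{R}$.
(ii) If $\mu < +\infty$, then $-\mu \in \mathcal{D}_{\Lambda^*}$ and $\Lambda^*(-\mu) = 0$.
(iii) $\Lambda^*(z) = \sup_{\lambda \ge 0}\{\lambda z - \Lambda(\lambda)\}$ if $z \ge -\mu$.
(iv) If $\Lambda(\lambda) < +\infty$ for some $\lambda < 0$, $\Lambda^*(z) = \sup_{\lambda \le 0}\{\lambda z - \Lambda(\lambda)\}$ for $z \le -\mu$.
(v) $\Lambda^*$ is decreasing on $(-\infty, -\mu)$ and increasing on $(-\mu, +\infty)$.
(vi) $\Lambda^*(z) > 0$ for $z > -\mu$.
(vii) $\Lambda^*$ is convex and thus continuous on the interior of $\{z : \Lambda^*(z) < +\infty\}$.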
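Proof.

(i) $\Lambda^*(z)$ is non-negative for $z \in \mathbb{R}$:

$$\Lambda^*(z) \ge 0 \cdot z - \Lambda(0) = 0.$$

(ii) By concavity of the logarithm function, we have

$$(23)\qquad \Lambda(\lambda) = \log \mathbf{E}\{e^{\lambda Y}\} \ge \mathbf{E}\{\log e^{\lambda Y}\} = \lambda\mathbf{E}\{Y\} = -\lambda\mu,$$

using Jensen's inequality. As a result

$$\Lambda^*(-\mu) = \sup_{\lambda}\{-\lambda\mu - \Lambda(\lambda)\} \le 0.$$

We conclude using the non-negativity of $\Lambda^*$.

(iii) If $\Lambda(\lambda) < +\infty$ for some $\lambda < 0$, then $\mu < +\infty$. In fact, $\Lambda(\lambda) < +\infty$ implies

$$\mathbf{E}\{Y\} \ge \mathbf{E}\{e^{\lambda Y}\}/\lambda > -\infty$$

by using the inequality $\lambda z \le e^{\lambda z}$ for all reals $\lambda$ and $z$. It follows that if $\mu = +\infty$, $\Lambda(\lambda) = +\infty$ for all $\lambda < 0$. In this case, the property trivially holds. For $\mu$ finite, $z \ge -\mu$ and $\lambda < 0$,

$$\lambda z - \Lambda(\lambda) \le -\lambda\mu - \Lambda(\lambda) \le \Lambda^*(-\mu) = 0.$$

(iv) As previously shown, we have $\mu < +\infty$ in this case. For $z \le -\mu$ and $\lambda > 0$,

$$\lambda z - \Lambda(\lambda) \le -\lambda\mu - \Lambda(\lambda) \le \Lambda^*(-\mu) = 0.$$

(v) For $z \ge -\mu$, $\Lambda^*(z) = \sup_{\lambda \ge 0}\{\lambda z - \Lambda(\lambda)\}$. This implies that $\Lambda^*$ is increasing on $[-\mu, +\infty)$ as $z \mapsto \lambda z - \Lambda(\lambda)$ is an increasing function. Now if $\Lambda(\lambda) < +\infty$ for some $\lambda < 0$, then $\Lambda^*(z) = \sup_{\lambda \le 0}\{\lambda z - \Lambda(\lambda)\}$ for $z \le -\mu$ and similarly we get $\Lambda^*$ decreasing on $(-\infty, -\mu)$. Otherwise if $\Lambda(\lambda) = +\infty$ for all $\lambda < 0$, then $\Lambda^*(z) = 0$ for all $z \le -\mu$.

(vi) For $z > -\mu$, consider the function $f : \lambda \mapsto \lambda z - \Lambda(\lambda)$. As $Y$ is a negative random variable, this function is defined for all $\lambda \ge 0$. Moreover it is differentiable and $f'(\lambda) = z - \mathbf{E}\{Ye^{\lambda Y}\}/\mathbf{E}\{e^{\lambda Y}\}$. Observe that $f'(0) = z + \mu > 0$. As a result $f$ is positive on a neighborhood of 0. As a result $\Lambda^*(z) = \sup_{\lambda}\{f(\lambda)\} > 0$ on this interval. Now as $\Lambda^*$ is increasing, we get the desired result.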




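(vii) For $\theta \in [0, 1]$,

$$\begin{aligned}
\theta\Lambda^*(z_1) + (1 - \theta)\Lambda^*(z_2) &= \sup_{\lambda \in \mathbb{R}}\{\theta\lambda z_1 - \theta\Lambda(\lambda)\} + \sup_{\lambda \in \mathbb{R}}\{(1 - \theta)\lambda z_2 - (1 - \theta)\Lambda(\lambda)\} \\
&\ge \sup_{\lambda \in \mathbb{R}}\{\lambda(\theta z_1 + (1 - \theta)z_2) - \Lambda(\lambda)\} \\
&= \Lambda^*(\theta z_1 + (1 - \theta)z_2). \quad \square
\end{aligned}$$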

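In the next proposition, another property of $\Lambda^*$ is introduced to prove that except in the case where $\mathcal{L}(Y)$ is a single mass, there exists $z > -\mu$ for which $\Lambda^*$ is finite.

Proposition 4. Let $Y$ be a negative random variable with $\mathbf{E}\{Y\} = -\mu \in [-\infty, 0)$. If $\mathcal{L}(Y)$ is not a single mass, then there exists $z > -\mu$ such that $\Lambda^*(z) < +\infty$. Moreover if $\Lambda(\lambda) < +\infty$ for some $\lambda < 0$, then there exists also $z < -\mu$ such that $\Lambda^*(z) < +\infty$.

Proof. We start by proving that $\Lambda(\lambda)/\lambda$ is a strictly increasing function for $\lambda > 0$. Writing $X = e^{Y}$, we have $\log \mathbf{E}\{e^{\lambda Y}\} = \log \mathbf{E}\{X^{\lambda}\}$. Let $0 < \lambda < \lambda'$, and define $g(x) = x^{\lambda'/\lambda}$ for $x > 0$. Then using Jensen's inequality for the convex function $g$:

$$\mathbf{E}\{X^{\lambda}\}^{1/\lambda} = g\big(\mathbf{E}\{X^{\lambda}\}\big)^{1/\lambda'} < \mathbf{E}\{g(X^{\lambda})\}^{1/\lambda'} = \mathbf{E}\{X^{\lambda'}\}^{1/\lambda'},$$

as $X$ is not constant. By taking the logarithm

$$\frac{\Lambda(\lambda)}{\lambda} < \frac{\Lambda(\lambda')}{\lambda'}.$$

Let $z_1 = \Lambda(1)$. By the fact that $\Lambda(\lambda)/\lambda$ is increasing, $\lambda(z_1 - \Lambda(\lambda)/\lambda) \le 0$ for $\lambda \ge 1$. Thus,

$$\Lambda^*(z_1) = \sup_{\lambda \ge 0}\{\lambda(z_1 - \Lambda(\lambda)/\lambda)\} = \sup_{0 \le \lambda \le 1}\{\lambda z_1 - \Lambda(\lambda)\} < +\infty.$$

Now, using equation (23), $\Lambda(0.5)/0.5 \ge -\mu$. But $z_1 = \Lambda(1)/1 > \Lambda(0.5)/0.5 \ge -\mu$. Finally, $z_1 > -\mu$ and $\Lambda^*(z_1) < +\infty$.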

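As for the case $z < -\mu$, we start by observing that $\Lambda(\lambda)/\lambda$ is also a strictly increasing function of $\lambda$ for $\lambda < 0$, using the same argument as above. Then if $\Lambda(\delta) < +\infty$ for some $\delta < 0$, let $z_\delta = \Lambda(\delta)/\delta$. We have $z_\delta < \Lambda(0.5\delta)/(0.5\delta) \le -\mu$. Moreover, $\lambda(z_\delta - \Lambda(\lambda)/\lambda) \le 0$ for $\lambda \le \delta$. Thus,

$$\Lambda^*(z_\delta) = \sup_{\delta \le \lambda \le 0}\{\lambda(z_\delta - \Lambda(\lambda)/\lambda)\} < +\infty. \quad \square$$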

Using these properties we prove the results needed for the function $\Psi$.

Proposition 5. Let $Y$ be a negative random variable with $\mathbf{E}\{Y\} = -\mu \in [-\infty, 0)$. Define the function $\Psi$ by $\Psi(c) = c\Lambda^*(-1/c)$ for $c > 0$. Let $\mathcal{D}_\Psi = \{c > 0 : \Psi(c) < +\infty\}$. Then,

(i) The function $\Psi$ is continuous on the interior of $\mathcal{D}_\Psi$. It is decreasing on $(0, 1/\mu) \cap \mathcal{D}_\Psi$ and strictly increasing on $(1/\mu, +\infty) \cap \mathcal{D}_\Psi$.

(ii) The set $\{c > 1/\mu : \Psi(c) > 1\}$ is non-empty. Define

$$\alpha_{\max} = \inf\Big\{c > \frac{1}{\mu} : \Psi(c) > 1\Big\}.$$

Then $\alpha_{\max} < +\infty$, and if $\mathcal{L}(X)$ is not a single mass, $\alpha_{\max} > 1/\mu$. Moreover, for $c \in (1/\mu, \alpha_{\max})$, we have $\Psi(c) < 1$.

(iii) If $\mu < +\infty$, define

$$\alpha_{\min} = \sup\, \{0\} \cup \Big\{c < \frac{1}{\mu} : \Psi(c) > 1\Big\}.$$

Then if $\mathcal{L}(X)$ is not a single mass, $\alpha_{\min} < 1/\mu$. Moreover, for $c \in (\alpha_{\min}, 1/\mu)$, we have $\Psi(c) < 1$.

Proof.

(i) The continuity follows from the continuity of $\Lambda^*$. For $(1/\mu, +\infty) \cap \mathcal{D}_\Psi$, $\Psi$ is strictly increasing because $\Lambda^*$ is increasing and $\Lambda^*(z) > 0$ for $z > -\mu$. For $(0, 1/\mu) \cap \mathcal{D}_\Psi$, using the convexity of $\Lambda^*$, we have for $z < z' < -\mu$ in $\mathcal{D}_{\Lambda^*}$:

$$\frac{\Lambda^*(z)}{-z} \ge \frac{\Lambda^*(z')}{-z'}.$$

Thus, $\Psi(-1/z) \ge \Psi(-1/z')$ and $\Psi$ is decreasing on $(0, 1/\mu) \cap \mathcal{D}_\Psi$.

(ii) Fix any $z' \in (-\mu, 0)$, then using the positivity of $\Lambda^*$, $\Lambda^*(z') > 0$ and thus for $c > -1/z'$,

$$\Psi(c) = c\Lambda^*(-1/c) \ge c\Lambda^*(z').$$

As a result, for $c$ large enough $\Psi(c) > 1$. This shows that $\alpha_{\max} < +\infty$. Moreover, if $\mathcal{L}(X)$ is not a single mass, then Proposition 4 and the continuity of $\Lambda^*$ imply that $\Psi$ is smaller than 1 on an interval $[1/\mu, c]$ for some $c > 1/\mu$. This shows that $\alpha_{\max} > 1/\mu$.

Furthermore, taking $c < \alpha_{\max}$, by definition of $\alpha_{\max}$ and as $\Psi$ is strictly increasing on $(1/\mu, +\infty)$, $\Psi(c) < 1$.

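(iii) First, if $\Lambda(\lambda) = +\infty$ for all $\lambda < 0$, then $\Lambda^*(z) = 0$ for all $z \le -\mu$. In this case, $\alpha_{\min} = 0 < 1/\mu$ and $\Psi(c) = 0 < 1$ for all $c \in (\alpha_{\min}, 1/\mu)$.

Assume now that $\Lambda(\lambda) < +\infty$ for some $\lambda < 0$. Then using Proposition 4, we have $\alpha_{\min} < 1/\mu$. It remains to show that for $c \in (\alpha_{\min}, 1/\mu)$, $\Psi(c) < 1$. Suppose for the sake of contradiction that this is not the case. Then there exists $c > \alpha_{\min}$ such that $\Psi(c) = 1$. As $\Psi$ is a decreasing function in $(\alpha_{\min}, 1/\mu)$, this implies that there exist $z_1' < z_2' < -\mu$ such that $\Lambda^*(z) = -z$ for all $z \in [z_1', z_2']$. But for $z_1 < z_2$ in $(z_1', z_2')$, we have

$$\begin{aligned}
-\frac{z_1 + z_2}{2} = \Lambda^*\Big(\frac{z_1 + z_2}{2}\Big) &= \sup_{\lambda \le 0}\Big\{\lambda\frac{z_1 + z_2}{2} - \Lambda(\lambda)\Big\} \\
(24)\qquad &\le \frac{1}{2}\sup_{\lambda \le 0}\{\lambda z_1 - \Lambda(\lambda)\} + \frac{1}{2}\sup_{\lambda \le 0}\{\lambda z_2 - \Lambda(\lambda)\} \\
&= -\frac{z_1 + z_2}{2}.
\end{aligned}$$

So we must have equality in (24). This means that the suprema defining $\Lambda^*(z_1)$ and $\Lambda^*(z_2)$ are attained at the same point. We have $\Lambda^*(z_1) = \lambda z_1 - \Lambda(\lambda)$ and $\Lambda^*(z_2) = \lambda z_2 - \Lambda(\lambda)$ for some $\lambda \le 0$. Observing that $\Lambda^*(z_1) - \Lambda^*(z_2) = \lambda(z_1 - z_2)$, we must have $\lambda = -1$. This implies that $\Lambda^*(z_1) = -z_1 - \Lambda(-1) = -z_1$. But $\Lambda(-1) = \log \mathbf{E}\{X^{-1}\} > 0$. This contradicts our assumption that $\Psi(c) = 1$ for some $c > \alpha_{\min}$. Note that we supposed here that for $z \in \{z_1, z_2\}$ there exists some $\lambda$ such that $\Lambda^*(z) = \lambda z - \Lambda(\lambda)$. In the next paragraph, we show that we can suppose this is the case.

Fix some $z \in [z_1, z_2]$. We want to show that there exists a $\lambda \le 0$ such that $\Lambda^*(z) = \lambda z - \Lambda(\lambda)$. Consider $\mathcal{D}_\Lambda = \{\lambda \in \mathbb{R} : \Lambda(\lambda) < +\infty\}$ and let $a = \inf \mathcal{D}_\Lambda$. Suppose first $a > -\infty$, and consider the limit $\ell = \lim_{\lambda \downarrow a} \Lambda(\lambda)$. This limit exists because $\Lambda$ is a decreasing function of $\lambda$. If $\ell < +\infty$, then by extending $\Lambda$ by continuity, $\Lambda^*(z) = \sup_{a \le \lambda \le 0}\{\lambda z - \Lambda(\lambda)\}$, so we can assume that the supremum is attained. If $\ell = +\infty$, then there exists $a_1$ such that $\Lambda(\lambda) > az$ for $\lambda < a_1$. Thus, we have $\Lambda^*(z) = \sup_{a_1 \le \lambda \le 0}\{\lambda z - \Lambda(\lambda)\}$, and the supremum is also attained in this case. Now suppose that $a = -\infty$ and define similarly $\ell = \lim_{\lambda \to -\infty} \Lambda(\lambda)$. If $\ell < +\infty$, then $\Lambda^*(z) = +\infty$, which is a contradiction. The last case is $\ell = +\infty$. As $\Lambda$ is a convex function, the function $\varphi : \lambda \mapsto \lambda z - \Lambda(\lambda)$ is a concave function, so it is monotone for $\lambda < \lambda_0$ small enough. If it is increasing, then $\Lambda^*(z) = \sup_{\lambda_0 \le \lambda \le 0}\{\lambda z - \Lambda(\lambda)\}$ and we are done. If $\varphi$ is decreasing for $\lambda < \lambda_0$, then we can suppose $\Lambda^*(z) = \lim_{\lambda \to -\infty} \lambda z - \Lambda(\lambda)$ and by assumption $\Lambda^*(z) = -z$. But then for $z_1' < z$, we have $\Lambda^*(z_1') \ge \lim_{\lambda \to -\infty} \lambda(z_1' - z) + \lambda z - \Lambda(\lambda) = +\infty$, which contradicts the fact that $\Lambda^*(z_1') = -z_1'$.

□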